PREFACE

Objective: This document is intended for those who require specific knowledge of
reliability theory and principles as applied to mechanical parts and systems. Its
content was selected based on a single criterion: utility. The most practical
reliability tools currently being applied effectively to assure reliable systems are
discussed. RAC authors have selected and organized the topics based on the
anticipated needs of those who must understand and apply the concepts of
reliability to system design and development. It is our goal to provide the
reliability practitioner with a concise document which contains only the most
useful and theoretically accurate tools to assure reliable systems. We hope we have
accomplished this goal in a manner which will provide long-lasting benefit to those
who take the time to read and understand this material.

Background: This document is the first revision (second edition) of the RAC
document "Analysis Techniques For Mechanical Reliability," which was first
published in 1985. Many enhancements have been added, especially in the area of
repairable system reliability. Many of the statistical analysis tools presented in the
original document remain in this first revision, and many have been updated and
expanded to include further examples. These examples aid in self-study and
promote an understanding of the utility of reliability evaluation techniques. The
reader should feel free to supplement the examples provided in the document with
his or her own experiences. In this way, the reader can tailor this document to
specific systems of personal interest.

Content:  The  content  of  this  document  has  been  separated  into  four  major 
sections: 

Section  A:  Introduction  to  Reliability 

Section  B:  Fundamental  Statistical  Concepts 

Section  C:  Part  Reliability  Engineering 

Section  D:  System  Reliability  Engineering 

Within each major section, numerous concepts or topics are discussed, as detailed
in the table of contents for this document. The material contained in each section
is intended to build upon the material of the previous section(s). In order to
develop well-rounded reliability analysis skills, it is recommended that this
document be read from cover to cover at least once and then used as a reference
source for future tasks. Knowledge or insight gained from Section A (Introduction)
regarding part and system concepts and terminology will be applied throughout
Section C (Part Reliability Engineering) and Section D (System Reliability
Engineering), where detailed reliability modeling and evaluation techniques are
discussed. Much of the terminology presented in Section A is accepted and used by
others in the reliability community and will hopefully be familiar to you.
Essential statistical concepts are presented in Section B, which provides insight
into the random variables of reliability and how they are described. This will be
vital to understanding other topics presented throughout this document.

Acknowledgments:  The  author  wishes  to  thank  the  following  individuals  for 
both  technical  and  clerical  contributions  to  this  document: 

•  Perry Nichols for supplying necessary engineering documents

1.0 INTRODUCTION

Reliability engineering is a professional discipline which combines knowledge of
statistics and engineering for the purpose of quantitatively evaluating, predicting,
measuring and improving the reliability of products. Reliability engineering
procedures have been applied to a vast array of products, including machines of all
types, structures, computer software and materials. What makes reliability
engineering, or any other engineering discipline, usable for its practitioners is the
set of analysis tools generated from the collective knowledge assembled within the
discipline.

The  material  contained  in  this  document  emphasizes  the  various  reliability 
analysis  techniques  which  are  available  to  those  who  must  evaluate,  model  or 
predict  the  reliability  of  parts  and  systems.  Although  mechanical  applications  are 
emphasized  in  this  document,  many  of  the  theories  which  are  presented  can  be 
universally  applied  to  other  functional  areas.  It  must  be  realized  that  these 
reliability  analysis  techniques  only  provide  the  means  to  an  end.  These  analysis 
tools  will  prove  useful  as  justification  for  the  design  changes,  corrective  actions  and 
planning  decisions  which  directly  improve  product  reliability.  As  reliability 
practitioners,  we  should  strive  to  provide  the  best  justification  possible  when 
recommending  design  changes  or  planning  future  activities  based  on  the  expected 
performance  of  a  product. 

With  this  in  mind,  this  document  was  developed  by  RAC  engineers  to  meet  the 
specific  goals  identified  in  Figure  1.0-1  and  to  provide  a  well  rounded  discussion  of 
both  part  and  system  reliability  analysis  tools.  The  major  sections  of  this  document 
are  the  following: 

Section  A:  Introduction  To  Reliability 

Section  B:  Fundamental  Statistical  Concepts 

Section  C:  Part  Reliability  Engineering 

Section  D:  System  Reliability  Engineering 

SECTION A: INTRODUCTION TO RELIABILITY

Goals: 

•  Establish  part  and  system  reliability  engineering  perspectives 

•  Define  a  "part" 

•  Define  a  "system" 

•  Establish  basic  terminology  associated  with  each  perspective 



SECTION  B:  FUNDAMENTAL  STATISTICAL  CONCEPTS 

Goals: 

•  Establish  statistical  concepts  &  terminology  pertinent  to  reliability 

•  Discuss  concepts  of:  the  random  variable,  sample,  population,  probability, 
distribution  of  a  random  variable,  and  dispersion  and  central  tendency  numerics 
associated  with  a  random  variable 

•  Identify  the  random  variables  of  reliability 



SECTION  C:  PART  RELIABILITY  ENGINEERING 

Goals: 

•  Discuss  mechanical  failure  mechanisms 

•  Discuss  mechanical  failure  theories  and  their  utility 

•  Establish  distribution  statistics  as  a  primary  part  failure  data  analysis 
procedure  with  emphasis  on  the  Weibull  failure  model 

•  Identify  and  discuss  the  utility  of  other  part  reliability  evaluation  techniques 
such  as  interference  analysis  and  life  assessment  models 



SECTION  D:  SYSTEM  RELIABILITY  ENGINEERING 

Goals: 

•  Establish  the  point  process  approach  to  system  failure  modeling 

•  Discuss  important  point  process  models  such  as  Homogeneous  Poisson  Process 
(HPP)  and  Nonhomogeneous  Poisson  Process  (NHPP) 

•  Discuss  utility  of  failure  mode  evaluation  tools  such  as  Failure  Mode,  Effects 
and  Criticality  Analysis  (FMECA)  and  Fault  Tree  Analysis  (FTA) 

FIGURE  1.0-1:  SUMMARY  OF  DOCUMENT  GOALS 

1.1 Reliability Engineering Perspectives

Reliability engineering has developed two principal perspectives toward the
analysis of reliability: part reliability engineering and system reliability
engineering. Each perspective has evolved in order to evaluate different empirical
and analytical reliability issues, and each deals with different items of primary
interest. Part reliability is concerned with the failure characteristics of the
individual nonrepairable part, which are used to make inferences about the part
population. System reliability is concerned with the failure characteristics of a
group of typically different parts assembled as a repairable system. Historically,
the analysis of parts and the use of part reliability based theories have dominated
the reliability discipline. Unfortunately, in some cases, this practice has led to the
misapplication of part reliability theories to systems. This domination persists even
though many of the system reliability theories were well documented as far back as
the mid-1960s (Reference [59]).

It  is  essential  to  realize  that  two  principal  perspectives  exist  and  represent  paths 
of  analytical  diversity  within  the  study  of  reliability.  Each  perspective  offers  its  own 
unique  set  of  reliability  terminology  and  statistical  theories.  It  is  the  goal  of  this 
document  to  provide  the  reader  with  an  accurate  account  of  each  perspective  and  to 
focus  on  the  use  of  appropriate  terminology  and  analysis  techniques  when 
evaluating parts and/or systems.


Reliability analysis techniques and terminology are not universally applicable;
they depend on which perspective is in effect, part or system. In general, a
segregated thought process, as shown in Figure 1.1-1, will serve the novice best
when developing, interpreting, or communicating reliability information. There are
numerous examples in the reliability literature where incorrect evaluations have
been performed because of confusion about, and misuse of, reliability analysis
techniques and terminology. Many of these misapplications could have been
prevented with a better understanding of the differences between part and system
evaluation procedures. Examples of these misapplications are identified and
discussed in Reference [55].

FIGURE 1.1-1: A "NOVICE" APPROACH TO RELIABILITY ENGINEERING CONCEPTS


Approaching  a  reliability  task  from  the  correct  perspective  is  required  to  obtain 
valid  results.  It  will  also  improve  your  ability  to  communicate  the  results  to 
colleagues  or  the  general  reliability  community.  Figure  1.1-2  illustrates  this  concept 
and  shows  efficient  lines  of  communication  among  the  various  members  of  the 
reliability community. Each member maintains a balanced perspective toward both
part and system reliability.

Once  the  correct  perspective  has  been  established,  the  correct  terminology  can  be 
applied.  For  example,  one  collects  individual  time-to-part  failure  (TTF)  data  for 
parts  but  collects  time-between-successive  system  failures  (TBF)  data  for  a  system. 
Understanding  these  subtle  differences  in  terminology  will  improve  our  ability  to 
develop,  interpret  and  communicate  reliability  information.  Figure  1.1-3  illustrates 
some  common  terms  which  are  associated  with  either  part  or  system  reliability.  All 
of  the  topics  indicated  in  Figure  1.1-3  are  discussed  at  various  points  throughout  this 
document. 

FIGURE 1.1-2: INTERPRETING AND COMMUNICATING
RELIABILITY ENGINEERING INFORMATION WITH
SUCCESS - HAVE A BALANCED PERSPECTIVE

Part Reliability Terms

•  Time-to-Part Failure of the ith part (TTF_i)
•  Mean Time to Failure (MTTF)
•  Failure Distributions
•  Exponential Failure Model
•  Weibull Failure Model
•  Force of Mortality (FOM) or Hazard Rate, h(x)
•  Wearout
•  Order Statistics (TTF(1), TTF(2), ..., TTF(n))
•  Spacing Between Order Statistics (i.e., the time between ordered TTF)
•  "x" for the age of a part

System Reliability Terms

•  Time-Between-Successive System Failures (TBF_i) or Interarrival Times
•  Mean Time Between Failure (MTBF)
•  Arrival Times (TTSF_i) or time to the ith system failure
•  Failure Process
•  Point Process Model
•  Homogeneous Poisson Process (HPP)
•  Nonhomogeneous Poisson Process (NHPP)
•  Rate of Occurrence of Failure (ROCOF)
•  Deterioration
•  Chronological Ordering
•  Laplace Trend Test
•  "t" for the age of a system

FIGURE 1.1-3: SIGNIFICANT RELIABILITY TERMS AND THEIR ASSOCIATION

1.2  Functional  Categories  of  Reliability 


Reliability  has  been  segregated  into  functional  categories.  Functional  categories 
have  been  identified  in  numerous  discussions  of  reliability  and  include: 
mechanical,  electronic,  structural  and  materials  to  name  just  a  few.  These  categories 
are  significant  because  they  identify  specific  specialty  areas  within  reliability.  They 
represent  a  natural  transition  to  more  detailed  areas  of  expertise  and  are  typically 
associated  with  the  standard  engineering  disciplines  such  as  mechanical  and 
electrical  engineering.  As  illustrated  in  Figure  1.2-1,  these  functional  areas  can  be 
viewed  as  specialty  "spin-off"  areas  from  the  main  body  of  reliability  engineering. 

FIGURE 1.2-1: FUNCTIONAL AREAS OF RELIABILITY


The  functional  areas  of  reliability  have  been  greatly  enhanced  because  of  the 
enormous  knowledge  base  which  currently  exists  in  each  of  the  related  engineering 
disciplines.  In  order  for  reliability  models  to  be  meaningful,  engineering  variables 
must  be  transitioned  into  reliability  variables  such  that  no  engineering  theories  are 
violated.  In  this  way,  reliability  engineering  practices  will  enhance  present 
engineering  procedures. 

Mechanical  reliability  is  a  specific  functional  area  of  reliability  engineering  which 
specializes  in  the  application  of  reliability  principles  to  mechanical  parts  and 
systems  with  mechanical  parts.  Here  the  analyst  can  use  his  or  her  engineering 
expertise  of  mechanical  failure  mechanisms,  mechanical  failure  theories,  material 
properties,  stress  concentrations,  fatigue  theory,  fracture  mechanics  or  other  related 
topics  to  improve  the  reliability  of  parts  and  systems.  The  functional  area  of 
mechanical  reliability  will  be  emphasized  throughout  this  document  in  the  form  of 
mechanically  oriented  discussions  and  examples. 

1.3  Concept  of  a  Part 

Essential to understanding the material contained in this document is the
significance of what characterizes a "part" and what characterizes a "system". A
"part" will be defined as a nonrepairable item which can only fail once and is then
discarded. A part can be categorized as a simple part or a complex part. A simple
part consists of a single component. For example, o-rings, belts, bolts, springs or
gears can be considered simple parts. A complex part consists of more than one
component. For example, a ball bearing, relay, thermostat, fuse or spark plug can be
considered a complex part because, like a simple part, it is typically discarded upon
failure but, unlike a simple part, it contains multiple components.


Figure  1.3-1  graphically  portrays  the  time-to-failure  (TTF)  associated  with  a  group 
of  identical  parts  tested  to  failure.  Note  how  parts  exhibit  no  life  after  first  failure 
and  are  considered  nonrepairable. 


Legend:  •  Time origin;  X  Failure event (first and only part failure)


FIGURE 1.3-1: ILLUSTRATION OF PART FAILURE FOR A
SAMPLE OF N IDENTICAL PARTS


Probabilistic failure models (f_P(x), F_P(x), R_P(x)) can be generated from a sample
of identical parts which have failed, such as those illustrated in Figure 1.3-1. These
probabilistic models describe the reliability characteristics of the parts in the sample.
As the sample size, n, approaches infinity (∞), the estimated reliability
characteristics approach their true values, which are representative of the entire
part population. The probabilistic models associated with TTF data are discussed in
Section B, and the procedures for probabilistic modeling of TTF data are discussed in
detail in Section 6.0.
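
As a rough illustration of how a sample of failed parts leads to estimates of the
cumulative distribution function and the reliability function, the Python sketch
below uses the simple empirical estimate "fraction of the sample failed by age x".
The failure times, the units and the function name are hypothetical, and this
estimator is a stand-in chosen for brevity; it is not the modeling procedure of
Section 6.0.

```python
from bisect import bisect_right

def empirical_reliability(ttf_sample, x):
    """Return (F_hat, R_hat) at age x from a complete sample of part failure times."""
    ordered = sorted(ttf_sample)             # failure times ordered by magnitude
    n = len(ordered)
    failed_by_x = bisect_right(ordered, x)   # number of parts with TTF <= x
    f_hat = failed_by_x / n                  # estimated probability of failure by age x
    return f_hat, 1.0 - f_hat                # reliability is the complement

# Hypothetical times to failure (hours) for n = 8 identical parts, all tested to failure.
ttf_hours = [410, 95, 630, 220, 880, 150, 305, 540]
f_hat, r_hat = empirical_reliability(ttf_hours, 300)
print(f"F(300) = {f_hat:.3f}, R(300) = {r_hat:.3f}")
```

With only eight parts the estimate moves in steps of 1/8; a larger sample gives a
finer approximation of the population values.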


1.4  Concept  of  a  System 

Unlike  a  part,  a  system  has  a  particular  characteristic  which  makes  it  unique, 
namely,  the  ability  to  experience  successive  failure  events  over  its  lifetime.  This 
sequential  series  of  failure  events  can  be  illustrated  on  a  continuous  time  line  as 
shown  in  Figure  1.4-1  and  is  called  a  failure  process. 

[Figure: time line of system age t, beginning at the time origin, with Failure
Events #1 through #4 each marked by an X.]


FIGURE 1.4-1: SYSTEM FAILURE PROCESS: TIME LINE OF
SYSTEM FAILURE EVENTS


The  time  line  is  typically  representative  of  system  operating  time.  Comparing 
Figure  1.3-1  and  Figure  1.4-1  reveals  the  differences  between  part  failures  and  system 
failures,  namely,  a  part  can  only  fail  once  but  a  system  can  fail  numerous  times. 
This  significant  difference  will  form  the  basis  for  further  statistical  pursuit  within 
each  of  these  areas. 

The  notation  given  to  the  various  time  segments  of  the  system  failure  process  is 
identified  in  Figure  1.4-2. 


TTSF_i = times to system failure events, or arrival times

FIGURE 1.4-2: TIME SEGMENT NOTATION FOR SYSTEM FAILURE PROCESS

Other references on repairable system reliability and time series analysis have
used other notation for both arrival times (e.g., T_i and X_i) and interarrival times
(e.g., X_i). This document has selected a slightly more descriptive set of notation,
as illustrated in Figure 1.4-2, to improve the distinction between part and system
variables. This essential notation is also summarized in the next section.
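
The relationship between arrival times and interarrival times can be sketched in a
few lines of Python. The sketch assumes the time origin is zero, so that each
interarrival time is the difference between successive arrival times; the function
name and the failure data are hypothetical.

```python
def interarrival_times(arrival_times):
    """Convert arrival times (TTSF) of one system into interarrival times (TBF)."""
    tbf = []
    previous = 0.0  # assumed time origin of the failure process
    for ttsf in arrival_times:
        if ttsf < previous:
            raise ValueError("arrival times must be in chronological order")
        tbf.append(ttsf - previous)  # time between the previous failure and this one
        previous = ttsf
    return tbf

# Hypothetical arrival times (operating hours) of four failures of one system.
ttsf = [120.0, 310.0, 450.0, 520.0]
tbf = interarrival_times(ttsf)
print(tbf)  # [120.0, 190.0, 140.0, 70.0]
```

The interarrival times are kept in their chronological order, and their sum
recovers the last arrival time.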


1.5  Essential  Notation 


The  following  is  a  list  which  summarizes  the  most  significant  notation  to  be 
used  throughout  this  document.  Some  of  the  notation  has  already  been  introduced 
in  previous  sections  and  should  be  reviewed  again  at  this  time.  Much  of  this 
terminology  is  consistent  with  other  published  works  and  this  will  hopefully 
improve  the  utility  of  the  material  to  follow.  A  glossary  of  reliability  terms  is  also 
provided  in  Appendix  A. 

Parts

P                            represents the random variable: time to part failure,
                             a primary random variable in part reliability
                             engineering

{TTF_1, TTF_2, ..., TTF_n}   set of time to part failure data from n identical parts

TTF(1), TTF(2), ..., TTF(n)  time to failure data from n identical parts which is
                             ordered by magnitude such that
                             TTF(1) < TTF(2) < ... < TTF(n) (i.e., order statistics)

TTF(i+1) - TTF(i)            spacing between ordered time to part failure data

x                            age of a part, typically considered the operational age

f_P(x)                       density function which describes a set of time to part
                             failure data from n identical parts where n → ∞

F_P(x)                       cumulative distribution function which describes a
                             set of time to part failure data from n identical parts
                             where n → ∞

h_P(x)                       hazard rate or force of mortality (FOM) of time to
                             part failure

R_P(x)                       part reliability or probability that the part survives to
                             age x

Systems

S_i                          represents a system random variable which is the
                             time between the (i-1)st and ith system failures
                             extracted from a population of system failure
                             processes

TBF_i                        time between the (i-1)st and ith system failures, also
                             the ith interarrival time of one system failure
                             process

T_i                          represents a system random variable which is the
                             time to the ith system failure or the ith arrival time

TTSF_i                       time to the ith system failure, also the ith arrival
                             time from one system failure process. A value of
                             random variable T_i

t                            total system age, typically considered the operational
                             age

TBF_1, TBF_2, ..., TBF_n     represents the natural order of interarrival times of
                             one system failure process which has n failure
                             events

General

f(x)                         probability density function of a random variable, X

F(x)                         cumulative distribution function of a random
                             variable, X

h(x)                         hazard rate or force of mortality of a random
                             variable, X

R(x)                         survivor function or reliability function of a random
                             variable, X
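
To make the part notation concrete, the following Python sketch orders a set of
time to part failure data by magnitude (the order statistics) and computes the
spacing between successive ordered values. The failure times and function names
are hypothetical and serve only to illustrate the notation.

```python
def order_statistics(ttf_sample):
    """Return the failure times ordered by magnitude: TTF(1), TTF(2), ..., TTF(n)."""
    return sorted(ttf_sample)

def spacings(ordered):
    """Return TTF(i+1) - TTF(i) for each adjacent pair of order statistics."""
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

# Hypothetical times to failure (hours) for five identical parts, in test order.
ttf = [410, 95, 630, 220, 880]
ordered = order_statistics(ttf)
gaps = spacings(ordered)
print(ordered)  # [95, 220, 410, 630, 880]
print(gaps)     # [125, 190, 220, 250]
```

Part data are reordered by magnitude in this way, whereas the interarrival times
of a system failure process are kept in their natural order.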

2.0 INTRODUCTION TO STATISTICAL CONCEPTS

There are many statistical tools which are used within reliability. The most basic
statistical tools will be discussed in this section. These basic statistical tools are
provided  here  in  order  to  emphasize  their  importance  and  to  discuss  their  utility 
with  respect  to  evaluating  parts  or  systems.  Even  those  people  with  superficial 
involvement  in  reliability  issues  should  be  knowledgeable  with  respect  to  the 
statistical  concepts  provided  in  this  section.  This  section  will  concentrate  on  the 
following  primary  topics: 


1.  Random  variables  within  reliability 

2.  Defining  part  and  system  reliability 

3.  Probability  functions 

4.  Continuous  statistical  distributions 

5.  Hazard  rate 

2.1  Concept  of  a  Random  Variable 

The  concept  of  a  random  variable  is  basic  to  an  understanding  of  statistical 
theories.  A  random  variable  is  very  simply  any  variable,  such  as  the  life  of  a  part, 
maximum  stress  in  a  part  or  the  time  to  the  first  critical  failure  in  a  system,  whose 
value  cannot  be  exactly  specified  for  each  element  in  the  population.  Random 
variables  must  be  represented  probabilistically  as  opposed  to  deterministically.  The 
concept  of  a  random  variable  can  be  illustrated  with  the  following  data  set  which 
represents  the  random  variable:  cycles-to-failure.  The  following  cycles-to-failure 
were  obtained  for  a  group  of  50  identical  relays  placed  on  a  life  test: 


 1283    1887    1888    2357    3137    3606    3752    3914    4394    4398
 4865    5147    5350    5353    5536    6499    6820    7733    8025    8185
 8559    8843    9305    9460    9595   10247   11492   12913   12937   13210
14840   14988   16306   17621   17807   20747   21990   23449   28946   29254
30822   31473   35811   38319   41554   42870   50246   62690   63910   73473

By observation of this sample of 50 relays, the random variable exhibits a range
from  1283  to  73473  cycles-to-failure.  When  another  relay  (from  this  population  of 
relays)  is  placed  on  life  test,  one  would  expect  the  resulting  life  to  fall  within  this 
range  if  the  initial  50  samples  were  representative  of  the  population  of  relays.  What 
we  cannot  do  is  state  the  exact  value  of  relay  life  with  certainty. 

The majority of random variables used in reliability are termed continuous
random variables. These are random variables whose possible outcomes form a
continuum, so that the number of possible outcomes is infinite. For example,
suppose we define a random variable, X, as the length of time a certain valve type
will operate properly before failure. In this example, the set of possible outcomes
for X may be considered to be the set of points on the nonnegative real line
(0 ≤ x < ∞). We could also say that the set of possible outcomes is the whole real
line (-∞ < x < ∞), even though a negative value can never actually occur when the
random variable considered is life. Whenever the possible outcomes span such an
interval of the real line, the set of possible outcomes is said to be continuous, and a
random variable defined over this set is called a continuous random variable.

In  most  practical  engineering  problems,  continuous  random  variables  represent 
measured  data,  such  as  temperature,  time  to  failure,  stress  or  strength.  Discrete 
random variables represent count, or attribute, data. Count data, such as the number
of successes or failures, give no information other than the fact that the device
passed or failed. We are primarily concerned with continuous random variables.

2.2  Random  Variables  in  Reliability 

The  primary  continuous  random  variables  which  are  significant  to  the 
evaluation  of  reliability  include  the  following: 

a)  time-to-part failure (P) represents one part random variable

b)  time-to the ith system failure (T_i) where i = 1, ..., n represents a set of n
different system random variables

c)  time-between the (i-1)st and ith system failures (S_i) where i = 1, ..., n
represents a set of n individual system random variables

These random variables (P, T_i, S_i) are derived from the basic empirical failure
data which is collected from both parts and systems. From a, b and c above, it
should be noted that empirical part data is described by a single random variable,
whereas the empirical data from a single system contains a sequence of 2n primary
random variables (n arrival times and n interarrival times, where n is the total
number of system failure events). It should also be noted that other system random
variables could be defined but, for our purposes, will not be considered primary
random variables. For example, a random variable could be specified which
represents the time between the first and fifth system failures.

These primary random variables are significant because of a very popular
reliability modeling approach known as probabilistic modeling or distribution
statistics (discussed in Section B). Probabilistic modeling is effective when applied to
a single random variable, whether it be a part or system random variable. Therefore,
probabilistic modeling would be effective for each of the random variables identified
in (a) through (c) above. In (a), a sample of identical parts would be required to apply
probability modeling. In (b) and (c), a sample of identical systems would be required
to perform probability modeling on the resulting failure data. In short, probability
modeling is applied to a single random variable. It is equally important to identify
when probability modeling should not be applied. Probability modeling should not
be applied to the following, because each represents a data set composed of multiple
random variables:

a)  interarrival times (TBF_i) from a single system failure process

b)  arrival times (TTSF_i) from a single system failure process

c)  mixed  interarrival  times  from  a  group  of  identical  systems 

d)  mixed  arrival  times  from  a  group  of  identical  systems 

e)  failure  data  from  a  group  of  different  parts 

So,  when  a  data  set  can  be  characterized  from  (a)  through  (e)  above,  probability 
modeling  techniques  are  invalid  and  should  not  be  applied.  Further  insight  as  to 
why  probability  modeling  techniques  are  not  appropriate  for  these  types  of  data  sets 
is  provided  in  Section  4.1. 

2.3 Describing a Random Variable Using the Histogram

In reliability, we are usually most concerned with the probability of success or
failure of a system or part, and with the prediction of such numerics. Often, this
must be done using empirical data observed in a fielded application or a
development test program. To draw inferences concerning the probability of failure
based on this sample data, we must first determine how the random variable of
interest is distributed. One way of making such a determination when sample
failure data is available is through the use of a histogram.

The  histogram  graphically  describes  the  frequency  distribution  of  continuous 
random  variables.  A  histogram  is  constructed  by  first  establishing  a  series  of 
intervals  or  cells  over  the  sample  range  of  the  random  variable.  A  recommended 
number of cells (k) in the histogram can be estimated by using Sturges' rule¹:


$$k = 1 + 3.3 \log_{10} n \tag{2-1}$$

where, 

n  =  sample  size 

As  an  example,  consider  the  data  in  Table  2.3-1,  which  represents  the  lifetimes  of 
40  similar  car  batteries  recorded  to  the  nearest  tenth  of  a  year.  The  car  batteries  are 
considered  complex  parts  as  discussed  earlier  in  Section  1.3  and  are  discarded  upon 
failure. 


TABLE 2.3-1: CAR BATTERY LIFETIMES (YEARS)


2.2   4.0   3.6   4.4   3.1   3.8   2.9   2.7
3.4   1.7   3.0   3.4   3.8   3.0   4.6   3.7
2.5   4.2   3.5   3.5   2.8   3.4   3.8   3.2
3.2   3.2   3.6   4.5   3.2   4.0   2.0   3.3
4.8   3.7   3.3   2.5   3.8   3.1   4.1   3.6

1  H.A. Sturges, "The Choice of a Class Interval," Journal of the American Statistical Association, 21, 65-66 (1926).

The number of cells for the above data set would be calculated using Sturges' rule
as follows:

$$k = 1 + 3.3 \log_{10} n = 1 + 3.3 \log_{10} 40 = 6.3$$

which becomes k = 7 when rounded up to the nearest whole number.

Based on the Sturges' rule calculation, a total of seven cells will be used to
subdivide the range of the sample data set given in Table 2.3-1.
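
The calculation above can be reproduced with a short Python sketch. The function
name is hypothetical, and rounding up to the next whole number follows the worked
example rather than any stated general rule.

```python
import math

def sturges_cells(n):
    """Recommended number of histogram cells, k = 1 + 3.3 log10(n), rounded up."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return math.ceil(1 + 3.3 * math.log10(n))

print(sturges_cells(40))  # 7, matching the battery example (k = 6.3 rounded up)
```

Because the rule grows with the logarithm of n, large increases in sample size add
only a few cells.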

Next,  the  cell  width  is  calculated.  The  cell  width  is  determined  by  dividing  the 
range  of  the  sample  data  by  the  number  of  recommended  intervals.  For  our 
example,  this  is  (4.8  -  1.7)/7  =  0.443.  This  represents  the  recommended  cell  width. 
Usually,  we  choose  equal  widths  having  the  same  number  of  significant  digits  as  the 
observed  data.  Denoting  this  width  by  w,  let's  choose  w  =  0.5.  Starting  the  first 
interval  at  1.6,  Table  2.3-2  shows  the  range  and  failure  frequency  (F)  of  each  cell. 

Next,  the  relative  frequency  (RF)  is  calculated  by: 

$$RF = \frac{\text{Number of failures during a cell interval}}{\text{Total number of items failed}} \tag{2-2}$$

The  relative  frequency  is  also  shown  on  Table  2.3-2. 


TABLE  2.3-2:  RELATIVE  FREQUENCY  DISTRIBUTION  OF  BATTERY  LIVES 


Cell Interval     Cell Midpoint    Frequency (F)    Relative Frequency (RF)

1.6 ≤ x < 2.1     1.85              2               0.050
2.1 ≤ x < 2.6     2.35              3               0.075
2.6 ≤ x < 3.1     2.85              5               0.125
3.1 ≤ x < 3.6     3.35             13               0.325
3.6 ≤ x < 4.1     3.85             11               0.275
4.1 ≤ x < 4.6     4.35              4               0.100
4.6 ≤ x < 5.1     4.85              2               0.050

Note: Battery Lives (years)
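
As a rough check, the tabulation can be reproduced with a short Python sketch. Two assumptions are made here and should be treated as such: the one illegible entry of Table 2.3-1 is taken to be 1.7 (the sample minimum quoted in the cell width calculation), and each cell is treated as including its lower boundary and excluding its upper boundary. The function name `frequency_table` is illustrative only.

```python
# Battery lifetimes from Table 2.3-1 (years). The 10th entry is illegible in
# the source; 1.7 is an ASSUMED value chosen to match the stated sample range.
LIFETIMES = [
    2.2, 4.0, 3.6, 4.4, 3.1, 3.8, 2.9, 2.7,
    3.4, 1.7, 3.0, 3.4, 3.8, 3.0, 4.6, 3.7,
    2.5, 4.2, 3.5, 3.5, 2.8, 3.4, 3.8, 3.2,
    3.2, 3.2, 3.6, 4.5, 3.2, 4.0, 2.0, 3.3,
    4.8, 3.7, 3.3, 2.5, 3.8, 3.1, 4.1, 3.6,
]

def frequency_table(data, start=1.6, width=0.5, cells=7):
    """Return (lower, upper, count, relative frequency) for each cell.

    Cells are lower-inclusive and upper-exclusive. Values are compared in
    integer tenths of a year so that boundary values such as 3.1 are binned
    without floating-point surprises.
    """
    tenths = [round(x * 10) for x in data]
    lo0, w = round(start * 10), round(width * 10)
    rows = []
    for i in range(cells):
        lo, hi = lo0 + i * w, lo0 + (i + 1) * w
        count = sum(lo <= t < hi for t in tenths)
        rows.append((lo / 10, hi / 10, count, count / len(data)))
    return rows

for lo, hi, f, rf in frequency_table(LIFETIMES):
    print(f"{lo:.1f} <= x < {hi:.1f}  F={f:2d}  RF={rf:.3f}")
```

Under these assumptions the counts agree with the frequency column of Table 2.3-2.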
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Finally,  with  the  information  provided  in  Table  2.3-2,  a  relative  frequency 
histogram  (also  called  a  proportionate  frequency  histogram)  can  be  constructed.  The 
relative  frequency  histogram  graphs  the  random  variable  by  cell  interval  along  the 
abscissa  and  relative  frequency  along  the  ordinate.  The  relative  frequency  histogram 
for  the  battery  life  data  given  in  Table  2.3-1  is  shown  in  Figure  2.3-1. 


[Figure: relative frequency plotted against battery life, years, over the sample range]


FIGURE  2.3-1:  RELATIVE  FREQUENCY  HISTOGRAM  FOR 
A  SAMPLE  OF  BATTERIES 


In  Figure  2.3-1,  the  bars  are  drawn  to  have  equal  width,  and  are  centered  at  the 
midpoint  of  each  class  interval.  The  height  of  each  bar  is  equal  to  the  observed 
relative  frequency  and  represents  a  proportionate  frequency  or  probability  that  a 
value  of  the  random  variable  will  occur  within  that  interval. 
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2.4  Probability  Density  Function,  f(x) 

The  relative  frequency  histogram  discussed  in  Section  2.3  is  an  approximation, 
based  on  a  limited  sample  size,  of  the  probability  distribution  of  a  random  variable. 
Stated  more  precisely,  the  limiting  form  of  a  relative  frequency  histogram,  as  the 
sample  size  approaches  infinity,  is  the  probability  distribution. 

This is represented graphically in Figure 2.4-1. As the sample size is increased,
the histogram approaches a smooth curve, and the resulting curve is now a
function of the random variable X, denoted by f_X(x) and termed the probability
density function. In general, a capital letter is used to represent the random variable
and a lower case letter is used to represent a specific value of the random variable.
In many instances, the random variable subscript (X) will be assumed and the
probability density function will be represented as just f(x).

When f(x) represents a probability density function for a continuous random
variable X, the expression f(x)dx (a measure of area) can be defined as the probability
that values of the random variable fall between [x - (1/2)dx] and [x + (1/2)dx]. In
general, the area under the probability density function represents probability as
shown in Figure 2.4-2.
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Sample 


FIGURE  2.4-1:  EFFECT  OF  INCREASED  SAMPLE  SIZE  (n)  ON  THE 
RESOLUTION  OF  THE  HISTOGRAM 
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FIGURE  2.4-2:  PROBABILITY  OF  AN  EVENT  OCCURRING  DURING 
A SPECIFIED INTERVAL: a TO b

The  probability  that  x  (a  specific  value  of  random  variable  X)  lies  in  some  finite 
range,  a  to  b,  is  generally  represented  as: 

$$\Pr(a < x < b) = \int_a^b f(x)\,dx$$  (2-3)

Returning  to  the  battery  example,  we  can  estimate  the  probability  density 
function,  f(x),  by  a  smooth  curve  fitted  to  the  relative  frequency  histogram  as  shown 
in  Figure  2.4-3.  The  probability  that  a  battery  fails  between  3.6  and  4.6  years  when 
selected  at  random  from  an  infinite  line  of  production  of  such  batteries  can  be 
determined  by  calculating  the  area  under  f(x)  from  3.6  to  4.6.  Mathematically,  this  is 
represented  as: 

$$\Pr(3.6 < x < 4.6) = \int_{3.6}^{4.6} f(x)\,dx$$  (2-4)

When  the  bounds  of  Equation  (2-3)  represent  the  extreme  limits  of  all  possible 
values  of  x,  a  characteristic  of  the  probability  density  function  is  obtained;  namely, 
the  total  area  under  the  probability  density  function  is  equal  to  one  or: 

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$  (2-5)
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Battery  life,  years 


FIGURE  2.4-3:  FITTING  THE  PROBABILITY  DENSITY  FUNCTION,  f(x), 

TO  THE  RELATIVE  FREQUENCY  HISTOGRAM 

Since  probability  is  a  dimensionless  number  that  ranges  from  0  to  1,  one  can  now 
rationalize  why  f(x)  is  described  as  a  probability  function. 

All probability density functions used as failure models in reliability are
completely described by a random variable with range 0 to ∞. This premise is based
on the rationale that life cannot be less than 0. For these density functions, Equation
(2-5) can be represented as:

$$\int_0^{\infty} f(x)\,dx = 1$$  (2-6)


Some  of  the  more  popular  continuous  probability  distributions,  defined  by 
probability  density  functions,  are  listed  in  Table  2.4-1.  The  random  variable  range  is 
also  defined  in  Table  2.4-1  for  each  density  function.  Each  of  these  probability 
distributions  will  be  covered  in  greater  detail  later  in  Section  2.9. 
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TABLE  2.4-1:  POPULAR  CONTINUOUS  PROBABILITY  DISTRIBUTIONS 


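Probability      Probability Density Function, f(x)                                                Variate Range, x
Distribution

Exponential      $f(x) = \lambda \exp(-\lambda x)$                                                  0 < x < ∞

Weibull*         $f(x) = \frac{\beta}{\alpha}\left(\frac{x - x_0}{\alpha}\right)^{\beta - 1} \exp\left[-\left(\frac{x - x_0}{\alpha}\right)^{\beta}\right]$      0 < x < ∞

Normal           $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$      -∞ < x < ∞

Log-Normal       $f(x) = \frac{1}{\sigma x \sqrt{2\pi}} \exp\left[-\frac{(\ln x - \mu)^2}{2\sigma^2}\right]$      0 < x < ∞


*Note, other mathematically equivalent forms are also available.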


2.5  Relative  Cumulative  Frequency  Polygon 

The  relative  cumulative  frequency  polygon  (also  called  the  proportionate 
cumulative  frequency  polygon)  is  an  important  way  of  representing  cumulative 
frequency  data.  The  relative  cumulative  frequency  polygon  is  obtained  by  plotting 
the  random  variable  on  the  abscissa  and  the  relative  cumulative  frequency  on  the 
ordinate.  The  cumulative  plot  is  useful  for  reading  various  values  at  a  glance.  For 
instance,  what  percent  of  batteries  from  the  example  in  Section  2.3  will  fail  during 
the  first  3.6  years?  To  determine  this,  we  can  plot  the  relative  cumulative  frequency 
polygon  of  X  (the  life  of  the  car  battery).  The  relative  cumulative  frequency  data  is 
summarized  in  Table  2.5-1,  being  derived  from  the  original  data  of  Table  2.3-1.  Next, 
we  plot  the  relative  cumulative  frequency  against  the  upper  cell  boundary  as  shown 
in  Figure  2.5-1.  We  can  now  quickly  see  from  Figure  2.5-1  that  57%  of  the  batteries 
will  fail  during  the  first  3.6  years. 
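
The accumulation step is simple enough to sketch in Python. The cell frequencies below are those of Table 2.3-2; the function name `relative_cumulative` is illustrative only.

```python
from itertools import accumulate

# Cell frequencies from Table 2.3-2 and the upper boundary of each cell.
upper_bounds = [2.1, 2.6, 3.1, 3.6, 4.1, 4.6, 5.1]
frequencies = [2, 3, 5, 13, 11, 4, 2]

def relative_cumulative(freqs):
    """Running total of failures divided by the total number of failures."""
    total = sum(freqs)
    return [c / total for c in accumulate(freqs)]

rcf = relative_cumulative(frequencies)
for bound, value in zip(upper_bounds, rcf):
    print(f"x < {bound:.1f}: {value:.3f}")
```

The value paired with the 3.6 year boundary is the fraction of batteries failed during the first 3.6 years.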
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TABLE  2.5-1:  RELATIVE  CUMULATIVE  FREQUENCY  DATA 


Cell Boundary    Cumulative Failure Frequency    Relative Cumulative Frequency

x < 1.6            0                              0.000
x < 2.1            2                              0.050
x < 2.6            5                              0.125
x < 3.1           10                              0.250
x < 3.6           23                              0.575
x < 4.1           34                              0.850
x < 4.6           38                              0.950
x < 5.1           40                              1.000

Battery  Life  (Years) 


FIGURE  2.5-1:  RELATIVE  CUMULATIVE  FREQUENCY  POLYGON 

From  the  relative  cumulative  frequency  polygon,  we  can  estimate  the 
cumulative  distribution  function,  F(x),  by  fitting  a  smooth  curve  to  the  data  points 
as  shown  in  Figure  2.5-2.  As  the  sample  size  approaches  infinity,  the  limiting  form 
of  the  relative  cumulative  frequency  polygon  is  the  cumulative  distribution 
function  F(x).  The  cumulative  distribution  function,  F(x),  will  be  considered  in 
more  detail  in  Section  2.6. 
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FIGURE  2.5-2:  CUMULATIVE  DISTRIBUTION  FUNCTION,  F(x) 

2.6  Cumulative  Distribution  Function,  F(x) 

The  cumulative  distribution  function,  F(x),  of  a  continuous  random  variable  X, 
having  a  probability  density  function,  f(x),  is  given  as: 

$$F(x) = \Pr(x \le a) = \int_{-\infty}^{a} f(x)\,dx$$  (2-7)

The  cumulative  distribution  function  is  typically  used  to  determine  the 
probability  that  a  random  variable  is  not  greater  than  a  specified  value. 

A  more  particular  form  of  Equation  (2-7)  arises  when  the  range  considered  is 
from  the  lower  limit  of  the  random  variable,  which  is  often  zero  (refer  to  Table  2.4- 
1),  to  some  specific  value,  a,  of  the  random  variable.  Thus,  the  cumulative 
distribution  function,  F(x),  or  the  probability  that  the  random  variable  is  not  greater 
than  a  specific  value,  a,  is: 

$$\Pr(0 \le x \le a) = \int_0^a f(x)\,dx$$  (2-8)

The  cumulative  distribution  function  F(x)  can  then  be  expressed  as: 

$$F(x) = \int_0^x f(x)\,dx$$  (2-9)

When the random variable is time-to-failure or life, Equation (2-9) represents the
probability of failure prior to some time x as shown in Figure 2.6-1.


FIGURE  2.6-1:  PROBABILITY  OF  FAILURE  AS  REPRESENTED  BY 
THE  AREA  UNDER  THE  PROBABILITY  DENSITY  FUNCTION 


The  cumulative  distribution  function  can  also  be  used  to  evaluate  the  probability 
of  an  event  occurring  in  a  specified  range,  a  <  x  <  b,  by: 

Pr(a  <  x  <  b)  =  F(b)  -  F(a)  (2-10) 

Furthermore,  the  derivative  of  the  cumulative  distribution  function  is  the 
probability  density  function,  or: 

$$f(x) = \frac{dF(x)}{dx}$$  (2-11)

A  simple  example  is  now  presented  to  show  the  utility  of  the  probability  density 
function  and  cumulative  distribution  function  (Reference  [3]). 
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Example  A.  Given  the  following  probability  density  function: 


$$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$


1.  Verify  that  the  total  area  under  the  curve  is  1. 

2.  Find  Pr  (0  <  x  <  2) 

Solution  A: 


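1.  $$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-1}^{2} \frac{x^2}{3}\,dx = \left.\frac{x^3}{9}\right|_{-1}^{2} = \frac{8}{9} + \frac{1}{9} = 1$$

2.  $$\Pr(0 < x < 2) = \int_0^2 \frac{x^2}{3}\,dx = \left.\frac{x^3}{9}\right|_0^2 = \frac{8}{9}$$ or 89%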


Example  B.  For  the  probability  density  function  above,  find  F(x)  and  evaluate 
Pr(0  <  x  <  2) 

Solution  B: 


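$$F(x) = \int_{-\infty}^{x} f(x)\,dx = \int_{-1}^{x} \frac{x^2}{3}\,dx = \left.\frac{x^3}{9}\right|_{-1}^{x} = \frac{x^3 + 1}{9}$$

$$\Pr(0 < x < 2) = F(2) - F(0) = \frac{9}{9} - \frac{1}{9} = \frac{8}{9}$$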

which  agrees  with  solution  A  above. 
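
Both examples can be checked numerically. The sketch below assumes the density of Example A is x²/3 on (-1, 2), uses a plain midpoint rule for the integrals, and compares the result with the closed-form F(x) of Example B. The helper name `integrate` is ours.

```python
def f(x: float) -> float:
    """Density from Example A: x^2/3 on (-1, 2), zero elsewhere."""
    return x * x / 3.0 if -1.0 < x < 2.0 else 0.0

def F(x: float) -> float:
    """Closed-form CDF from Example B, clamped outside the support."""
    if x <= -1.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return (x ** 3 + 1.0) / 9.0

def integrate(func, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

total_area = integrate(f, -1.0, 2.0)   # should be close to 1
p_numeric = integrate(f, 0.0, 2.0)     # area under f(x) from 0 to 2
p_from_cdf = F(2.0) - F(0.0)           # Equation (2-10)
print(round(total_area, 6), round(p_numeric, 6), round(p_from_cdf, 6))
```

The area obtained by integrating f(x) directly and the difference F(2) - F(0) should agree to within the accuracy of the numerical integration.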
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2.7  Survivor  or  Reliability  Function,  R(x) 

The reliability function, R(x), which is defined as the probability of a device not
failing prior to some time, x, is given by:

$$R(x) = 1 - F(x) = 1 - \int_0^x f(x)\,dx$$  (2-12)

$$R(x) = \int_x^{\infty} f(x)\,dx$$  (2-13)

In other words, R(x) is the probability that the device survives to time x, that is,
that failure has not occurred prior to x.

The  application  of  the  survivor  function,  R(x),  to  describe  the  reliability  of  a  part 
population  can  be  done  simply  by  substituting  the  appropriate  density  function  (i.e., 
failure  model)  into  Equation  (2-13).  The  procedure  becomes  conceptually  more 
difficult  when  we  require  a  reliability  statement  for  a  system  since  a  system  exhibits 
a  sequence  of  failures. 

The  question  arises  of  how  to  postulate  a  survivor  function  for  a  system.  This 
problem  will  be  discussed  in  more  detail  in  Section  D.  But  for  now,  it  will  help  to 
interpret  the  survivor  function  of  Equation  (2-13)  as: 

$$R(0, x) = \int_x^{\infty} f(x)\,dx$$  (2-14)


This  improves  our  interpretation  of  the  survivor  function  or  reliability  function 
to  emphasize  that  reliability  is  defined  over  a  specified  time  interval.  In  the  case  of 
parts,  reliability  is  defined  as  probability  of  no  failure  in  the  interval  0  to  some  time 
x. 


The  characteristics  of  the  reliability  function  are:  (1)  at  time  zero  the  reliability 
function  is  one  and  (2)  as  time  approaches  infinity,  the  reliability  function 
approaches  zero.  These  characteristics  are  illustrated  in  Figure  2.7-1.  The  reliability 
expressed  in  Equation  (2-13)  is  illustrated  in  Figure  2.7-2. 
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Reliability 

W 

Function 

FIGURE  2.7-1:  GENERAL  RELIABILITY  FUNCTION 


[Figure: probability density function f(x), with R(x) labeled as an area under the curve]


FIGURE  2.7-2:  RELIABILITY  REPRESENTED  BY  THE  AREA  UNDER 
THE  PROBABILITY  DENSITY  FUNCTION 
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Differentiation  of  Equation  (2-12)  yields: 


$$\frac{dR(x)}{dx} = -\frac{dF(x)}{dx} = -f(x)$$  (2-15)

Equation (2-15) will be used in Section 2.8 to derive the hazard rate.


2.8 Hazard Rate, h(x)


The  hazard  rate,  h(x),  or  the  force  of  mortality  (FOM)  is  a  conditional  expression 
that  gives  the  probability  that  a  device  already  in  service  for  time  x  will  fail  in  the 
next  instant  of  time,  dx,  given  that  it  has  not  failed  previously.  Thus,  it  essentially 
shows  how  the  risk  of  failure  changes  over  time. 

Let  F(x)  be  the  cumulative  distribution  function  of  the  time-to-failure  random 
variable  X,  and  let  f(x)  be  its  probability  density  function.  Then  the  hazard  function, 
h(x),  can  be  derived  as  follows: 


The probability of failure in a given time interval between x and x + Δx can be
expressed as:

$$\int_x^{\infty} f(x)\,dx - \int_{x + \Delta x}^{\infty} f(x)\,dx = R(x) - R(x + \Delta x)$$  (2-16)


where  R(x)  is  the  reliability  or  probability  of  survival  at  time  x. 


The risk of failure, λ(x), in the interval x to x + Δx is defined as the probability
that failure occurs in the interval (Equation (2-16)), divided by the product of the
probability that failure does not occur prior to the start of the interval (Equation
(2-13)) and the interval length, or

$$\lambda(x) = \frac{R(x) - R(x + \Delta x)}{R(x)\,\Delta x}$$  (2-17)


The hazard rate, h(x), or force of mortality, is defined as the limit of λ(x) as the
interval length approaches zero, or

$$h(x) = \lim_{\Delta x \to 0} \frac{R(x) - R(x + \Delta x)}{R(x)\,\Delta x}$$  (2-18)

or

$$h(x) = \frac{1}{R(x)} \lim_{\Delta x \to 0} \frac{R(x) - R(x + \Delta x)}{\Delta x}$$  (2-19)

Using the definition of a derivative, Equation (2-19) becomes:

$$h(x) = -\frac{1}{R(x)} \frac{dR(x)}{dx}$$  (2-20)

It was shown previously in Equation (2-15) that

$$f(x) = -\frac{dR(x)}{dx}$$  (2-21)

Substitution  of  Equation  (2-21)  into  Equation  (2-20)  yields  the  definition  of  the 
hazard  function: 


$$h(x) = \frac{f(x)}{R(x)} = \frac{f(x)}{1 - F(x)}$$  (2-22)


The  hazard  function  is  one  of  the  fundamental  relationships  important  in 
reliability  analysis.  It  is  important  to  note  that  even  though  the  hazard  rate  is 
defined  by  probability  functions,  the  hazard  rate  is  not  a  probability  function. 
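
The limiting argument can be illustrated numerically. The sketch below assumes a two parameter Weibull failure model with hypothetical parameter values (scale 1000 hours, shape 2), evaluates the interval risk [R(x) - R(x + Δx)] / [R(x) Δx] for shrinking Δx, and compares it with f(x)/R(x). All names and numbers are illustrative.

```python
import math

ALPHA, BETA = 1000.0, 2.0   # hypothetical scale (hours) and shape values

def reliability(x: float) -> float:
    """R(x) for the assumed two parameter Weibull model."""
    return math.exp(-((x / ALPHA) ** BETA))

def density(x: float) -> float:
    """f(x) for the assumed two parameter Weibull model."""
    return (BETA / ALPHA) * (x / ALPHA) ** (BETA - 1) * reliability(x)

def interval_risk(x: float, dx: float) -> float:
    """Risk of failure over (x, x + dx): [R(x) - R(x + dx)] / [R(x) * dx]."""
    return (reliability(x) - reliability(x + dx)) / (reliability(x) * dx)

def hazard(x: float) -> float:
    """Hazard rate as the ratio f(x) / R(x)."""
    return density(x) / reliability(x)

x = 500.0
for dx in (100.0, 10.0, 1.0, 0.01):
    print(f"dx={dx:>6}: risk={interval_risk(x, dx):.6e}")
print(f"h(x)        = {hazard(x):.6e}")
```

As Δx shrinks, the interval risk settles on the hazard rate. Note that the result carries units of failures per hour, which is one way to see that the hazard rate is not itself a probability.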

The  reliability  function,  R(x),  can  be  derived  strictly  in  terms  of  the  hazard  rate 
function,  h(x),  as  follows. 


Equation (2-22) is expressed as:

$$f(x) = h(x)\,[1 - F(x)]$$  (2-23)

The cumulative distribution function was previously defined as:

$$F(x) = \int_0^x f(x)\,dx$$  (2-24)
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The  derivative  of  the  cumulative  distribution  function  is: 


$$\frac{dF(x)}{dx} = f(x)$$  (2-25)

Substituting Equation (2-23) into Equation (2-25) yields:

$$\frac{dF(x)}{dx} = h(x)\,[1 - F(x)]$$  (2-26)

or,

$$h(x)\,dx = \frac{dF(x)}{1 - F(x)}$$  (2-27)

Integrating Equation (2-27) yields:

$$\int_0^x h(x)\,dx = \int_0^x \frac{dF(x)}{1 - F(x)}$$  (2-28)

$$\int_0^x h(x)\,dx = \Big. -\ln\,[1 - F(x)] \Big|_0^x$$  (2-29)

$$\int_0^x h(x)\,dx = -\ln\,[1 - F(x)] + \ln\,[1 - F(0)]$$  (2-30)

Since F(0) = 0, Equation (2-30) reduces to:

$$\int_0^x h(x)\,dx = -\ln\,[1 - F(x)]$$  (2-31)


The reliability function, R(x), in terms of the hazard function is then
expressed as:

$$R(x) = 1 - F(x) = \exp\left[-\int_0^x h(x)\,dx\right]$$  (2-32)

Using Equations (2-32) and (2-22), the probability density function, f(x), can also be
expressed entirely in terms of the hazard function.

$$f(x) = h(x) \exp\left[-\int_0^x h(x)\,dx\right]$$  (2-33)
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A  summary  of  the  important  relationships  involving  the  hazard  function  is 
provided  in  Table  2.8-1. 

TABLE  2.8-1:  SUMMARY  OF  IMPORTANT  RELATIONSHIPS 
INVOLVING  THE  HAZARD  FUNCTION 


Relationship                                                     Interpretation

$h(x) = \frac{f(x)}{R(x)} = \frac{f(x)}{1 - F(x)}$               Hazard rate in terms of the survivor
                                                                 function, R(x), and the cumulative
                                                                 distribution function, F(x)

$R(x) = 1 - F(x) = \exp\left[-\int_0^x h(x)\,dx\right]$          Survivor function, R(x), expressed in
                                                                 terms of the hazard rate

$f(x) = h(x) \exp\left[-\int_0^x h(x)\,dx\right]$                Probability density function, f(x),
                                                                 expressed in terms of the hazard rate
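
To make the last two relationships concrete, the sketch below starts from a hazard function alone and recovers R(x) and f(x) by numerical integration. The hazard h(x) = 2x is a hypothetical choice (it corresponds to a Weibull model with scale 1 and shape 2), picked because the exact answers, exp(-x²) and 2x exp(-x²), are easy to compare against.

```python
import math

def hazard(x: float) -> float:
    """Hypothetical increasing hazard h(x) = 2x."""
    return 2.0 * x

def cumulative_hazard(x: float, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of h from 0 to x."""
    dx = x / steps
    return dx * sum(hazard((i + 0.5) * dx) for i in range(steps))

def reliability_from_hazard(x: float) -> float:
    """R(x) = exp(-integral of h from 0 to x)."""
    return math.exp(-cumulative_hazard(x))

def density_from_hazard(x: float) -> float:
    """f(x) = h(x) * exp(-integral of h from 0 to x)."""
    return hazard(x) * reliability_from_hazard(x)

x = 0.8
print(reliability_from_hazard(x), math.exp(-x ** 2))
print(density_from_hazard(x), 2 * x * math.exp(-x ** 2))
```

Each printed pair shows the value rebuilt from the hazard next to the closed-form value for comparison.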

2.9  Probability  Distributions  for  Continuous  Random  Variables 

Although  there  are  several  probability  distributions  to  choose  from,  experience 
has  shown  that  for  reliability  work,  a  relatively  small  number  of  the  distributions 
will  satisfy  most  needs.  This  section  provides  a  summary  of  the  following 
distributions  most  commonly  used  for  reliability  work. 


1)  Exponential 

2)  Weibull 

3)  Normal 

4)  Log-normal 

5)  Extreme-value 

2.9.1  Exponential  Distribution 

The  exponential  distribution  is  defined  by  the  following  probability  density 
function: 


$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0,\ \lambda > 0 \\ 0, & \text{elsewhere} \end{cases}$$  (2-34)
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The  cumulative  distribution  function  is  defined  to  be: 


$$F(x) = \int_{-\infty}^{x} f(x)\,dx = \int_0^x \lambda e^{-\lambda x}\,dx = \Big. -e^{-\lambda x} \Big|_0^x$$

or,

$$F(x) = 1 - e^{-\lambda x}$$  (2-35)


In  the  past,  the  exponential  distribution  has  been  the  most  commonly  used 
distribution  in  reliability  to  model  the  failure  characteristics  of  parts.  However,  the 
widespread  use  of  the  exponential  distribution  is  not  necessarily  an  indication  that  it 
is  always  appropriate.  In  fact,  for  mechanical  parts,  it  is  usually  inappropriate.  The 
reason  for  its  popularity  lies  in  the  mathematical  simplicity  of  the  resulting 
functional  expressions  for  reliability  and  the  hazard  rate. 

The  reliability  function  for  the  exponential  distribution  is  defined  as: 


$$R(x) = 1 - F(x) = 1 - \int_0^x f(x)\,dx$$  (2-36)

$$R(x) = e^{-\lambda x}$$  (2-37)

Recall that the hazard rate is defined as:

$$h(x) = \frac{f(x)}{R(x)}$$  (2-38)

Substituting Equations (2-37) and (2-34) into Equation (2-38) yields:

$$h(x) = \frac{\lambda e^{-\lambda x}}{e^{-\lambda x}}$$

$$h(x) = \lambda$$ , (a constant for the exponential case)  (2-39)


When time-to-failure is the random variable of interest, the parameter λ has
become known as the "failure rate". Therefore, for the exponential case, the hazard
rate is constant over time. This means that regardless of how many hours a device
has survived (100, 500 or 1000 hours) the risk of failure in the next instant is the
same. A constant hazard rate also characterizes "random" failures.

The simplicity of Equations (2-37) and (2-39) has been a contributing factor to
the overwhelming popularity of the exponential distribution. Another contributing
factor has been the lack of sufficient time-to-failure data that could be used to better
characterize other failure distributions. These two factors have led many engineers
to assume a constant hazard rate and an exponential failure model.
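
The consequence of a constant hazard rate can be seen in a short Python sketch. The failure rate of 0.001 failures per hour and the 50 hour mission length are hypothetical values chosen only for illustration. The code computes the probability of surviving a further 50 hours, given survival to 100, 500 or 1000 hours, as the ratio R(age + mission) / R(age).

```python
import math

def reliability(x: float, lam: float) -> float:
    """Exponential reliability function, R(x) = exp(-lambda * x)."""
    return math.exp(-lam * x)

def conditional_survival(age: float, mission: float, lam: float) -> float:
    """Probability of surviving `mission` more hours given survival to `age`."""
    return reliability(age + mission, lam) / reliability(age, lam)

lam = 0.001  # hypothetical failure rate, failures per hour
for age in (100.0, 500.0, 1000.0):
    print(age, round(conditional_survival(age, 50.0, lam), 6))
print("new device:", round(reliability(50.0, lam), 6))
```

Every line prints the same value: under the exponential model a device that has already survived is treated as being as good as new. This is exactly the property that makes the model questionable for mechanical parts subject to wear.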

2.9.2  Weibull  Distribution 

A distribution that continues to gain popularity in reliability work is the Weibull
distribution. The original work regarding this distribution was presented in a
hallmark paper which appeared in the Journal of Applied Mechanics in 1951². As
the paper had indicated, it has indeed become "a statistical distribution function of
wide applicability". The reason for its growth in popularity is its versatility. Many
of the distributions used in reliability and described in Section 2.9 can be derived
from or approximated by the Weibull density function, which is defined as:

$$f(x) = \frac{\beta}{\alpha}\left(\frac{x - x_0}{\alpha}\right)^{\beta - 1} \exp\left[-\left(\frac{x - x_0}{\alpha}\right)^{\beta}\right], \quad x > x_0$$  (2-40)

where,

x_0 = expected minimum value of the random variable
β = shape parameter or Weibull slope (β > 0)
α = scale parameter or characteristic value (α > 0)

The  cumulative  distribution  function  for  the  Weibull  distribution  can  be 
derived  by  substituting  Equation  (2-40)  into  Equation  (2-9),  the  result  is: 

$$F(x) = 1 - \exp\left[-\left(\frac{x - x_0}{\alpha}\right)^{\beta}\right], \quad x > x_0$$  (2-41)

² Weibull, Waloddi (1951). "A Statistical Distribution Function of Wide Applicability", Journal of
Applied Mechanics, pg. 293-297.

As noted, several of the other distributions can be derived from the Weibull
distribution. Modeling other distributions can be accomplished by selecting the
appropriate value of β, the Weibull shape parameter. Table 2.9-1 provides a list of
distributions and their corresponding values of β.

TABLE 2.9-1: WEIBULL SHAPE PARAMETER, β,
AND RESULTING DISTRIBUTION

Shape Parameter Value    Corresponding Distribution

β < 1                    Gamma (k < 1)
β = 1                    Exponential
β = 2                    Rayleigh
β = 1.5                  Log-normal (approximate)
β = 3.44                 Normal (approximate)

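Equation (2-40) represents a three parameter Weibull distribution. In the study
of reliability, where it is reasonable to assume that a lower bound on life is zero, the
variable x_0 in Equation (2-40) can be set to zero and a two parameter Weibull
distribution is created, where,

$$f(x) = \begin{cases} \dfrac{\beta}{\alpha}\left(\dfrac{x}{\alpha}\right)^{\beta - 1} \exp\left[-\left(\dfrac{x}{\alpha}\right)^{\beta}\right], & x > 0,\ \alpha > 0,\ \beta > 0 \\ 0, & \text{elsewhere} \end{cases}$$  (2-42)

and

$$F(x) = 1 - \exp\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right]$$  (2-43)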


The characteristic value, α, can be expressed in terms of the cumulative
distribution function by setting x = α, then:

$$F(\alpha) = 1 - \exp\left[-\left(\frac{\alpha}{\alpha}\right)^{\beta}\right]$$  (2-44)

$$F(\alpha) = 1 - \exp(-1)$$

$$F(\alpha) = 0.632$$ or 63.2%

Therefore, the characteristic value, α, is that value at which 63.2% of the
population will have failed when the random variable considered is life.
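
That this result does not depend on the shape parameter can be confirmed with a small Python sketch. The characteristic life of 2000 hours and the shape values tried are hypothetical; the function evaluates F(x) = 1 - exp[-((x - x_0)/α)^β].

```python
import math

def weibull_cdf(x: float, alpha: float, beta: float, x0: float = 0.0) -> float:
    """Weibull CDF, F(x) = 1 - exp(-((x - x0)/alpha)**beta), zero for x <= x0."""
    if x <= x0:
        return 0.0
    return 1.0 - math.exp(-(((x - x0) / alpha) ** beta))

alpha = 2000.0  # hypothetical characteristic life, hours
for beta in (0.5, 1.0, 2.0, 3.44):
    print(beta, round(weibull_cdf(alpha, alpha, beta), 3))
```

Whatever the shape, the fraction failed at x = α is 1 - exp(-1), which is why α is a convenient single-number summary of life.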


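Using Equations (2-40) and (2-13), we can derive the reliability function for the
three parameter Weibull distribution as,

$$R(x) = \exp\left[-\left(\frac{x - x_0}{\alpha}\right)^{\beta}\right]$$  (2-45)

and the Weibull hazard rate is derived from Equation (2-22) as:

$$h(x) = \frac{\beta}{\alpha}\left(\frac{x - x_0}{\alpha}\right)^{\beta - 1}$$  (2-46)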


The Weibull probability density function and the hazard rate are sketched for
five values of the shape parameter, β, in Figure 2.9-1 and 2.9-2, respectively. Figure
2.9-2 indicates that the Weibull hazard function is an increasing function when β > 1
and is independent of the random variable when β = 1. When β < 1 the hazard rate
decreases as the random variable increases. This illustrates the versatility of the
Weibull distribution to represent a family of various distributions.

Many mechanical components are characterized by an increasing hazard rate (β >
1) due to deterioration or wear. A decreasing hazard rate (β < 1) is useful in
characterizing phenomena such as work hardening and other life improvement
processes. The constant hazard rate (β = 1) is generally used to characterize random
failures such as those exhibited by many electronic components. A detailed
treatment of the utility of the Weibull distribution can be found in Reference [2].
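
The three regimes can be reproduced with a few lines of Python. The sketch evaluates the Weibull hazard rate, h(x) = (β/α)((x - x_0)/α)^(β - 1), at a handful of arbitrary times with α = 1 and x_0 = 0, and classifies the trend. The helper `trend` and the chosen times are illustrative only.

```python
def weibull_hazard(x: float, alpha: float, beta: float, x0: float = 0.0) -> float:
    """Weibull hazard rate: (beta/alpha) * ((x - x0)/alpha)**(beta - 1)."""
    return (beta / alpha) * ((x - x0) / alpha) ** (beta - 1.0)

def trend(values) -> str:
    """Classify a sequence as increasing, decreasing, constant, or mixed."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(abs(d) < 1e-15 for d in diffs):
        return "constant"
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    return "mixed"

times = [0.5, 1.0, 1.5, 2.0]
for beta in (0.5, 1.0, 2.0):
    rates = [weibull_hazard(t, alpha=1.0, beta=beta) for t in times]
    print(beta, trend(rates))
```

A shape value below one gives a falling hazard, a value of one gives a flat hazard, and a value above one gives a rising hazard, matching the description of Figure 2.9-2.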
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FIGURE  2.9-1:  WEIBULL  DENSITY  FUNCTIONS,  VARIOUS  SHAPES  ((3) 
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FIGURE  2.9-2:  WEIBULL  HAZARD  FUNCTIONS,  VARIOUS  SHAPES  (p) 
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2.9.3  Normal  Distribution 


The  normal  density  function  was  derived  by  Carl  F.  Gauss  in  1809  in  his 
investigations  of  the  mathematics  of  planetary  orbits.  Since  that  time,  the  normal 
distribution  has  become  one  of  the  best  known  and  most  widely  used  statistical 
distributions.  Even  though  the  normal  distribution  has  limited  use  for  analyzing 
failure  data  as  discussed  below,  it  is  very  useful  as  a  model  for  characterizing  other 
random  variables  used  in  reliability  analyses.  For  example,  the  normal  distribution 
is  frequently  used  to  describe  the  stress  or  strength  of  a  part  in  the  application  of 
interference  analysis  (refer  to  Section  3.6).  In  general,  the  normal  distribution  is  a 
good  representative  model  for  the  distribution  of  variables  from  many  naturally 
occurring  phenomena  which  are  expected  to  be  symmetric. 


The  probability  density  function  for  the  normal  distribution  is  represented  by  the 
mathematical  function: 


$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right], \quad -\infty < x < \infty$$  (2-47)

where,

f(x) = probability density at x
μ = measure of central tendency (mean of the population)
σ = standard deviation
σ² = variance


The  normal  distribution  is  not  generally  utilized  as  a  failure  model  because  it 
does  not  satisfy  the  conditions  required  for  a  time-to-failure  probability  density 
function.  In  real  world  situations,  the  time-to-failure  for  both  parts  and  systems 
must  be  positive  and  the  probability  density  function  (i.e.,  failure  model)  is  then 
distributed from 0 to ∞, or

$$\int_0^{\infty} f(x)\,dx = 1$$  (2-48)
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Instead,  the  normal  distribution  satisfies: 


$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$  (2-49)


This  situation  can  be  rectified  by  modifying  the  normal  distribution  so  as  to  fully 
comply  with  the  conditions  for  a  time-to-failure  distribution.  The  modification 
function  has  the  form: 


$$g(x) = \frac{f(x)}{1 - \int_{-\infty}^{0} f(x)\,dx}, \quad 0 < x < \infty$$  (2-50)


However,  g(x)  is  no  longer  a  normal  distribution. 
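
A numerical sketch of this modification is given below. The mean of 2.0 and standard deviation of 1.5 are hypothetical values, chosen so that a noticeable share of the unmodified normal distribution lies below zero. The normal CDF is evaluated with the error function from the Python standard library, and the area under g(x) is checked with a midpoint rule.

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def truncated_pdf(x: float, mu: float, sigma: float) -> float:
    """g(x) = f(x) / (1 - F(0)) for x >= 0, and zero for x < 0."""
    if x < 0.0:
        return 0.0
    return normal_pdf(x, mu, sigma) / (1.0 - normal_cdf(0.0, mu, sigma))

mu, sigma = 2.0, 1.5                       # hypothetical parameter values
lost = normal_cdf(0.0, mu, sigma)          # probability the normal puts below zero
steps, upper = 200_000, mu + 12.0 * sigma  # upper limit far into the right tail
dx = upper / steps
area = dx * sum(truncated_pdf((i + 0.5) * dx, mu, sigma) for i in range(steps))
print(round(lost, 4), round(area, 6))
```

The first number is the probability that the unmodified normal model assigns to negative life; the second confirms that g(x) integrates to one over 0 to ∞. When the mean is several standard deviations above zero, the correction becomes negligible.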

As with the Weibull distribution, the normal distribution is representative of a
family of distributions, each member with unique values of μ and σ² in the case of
the normal distribution. A nice feature of the normal distribution is that any
normal distribution can be transformed to a standard normal distribution with a
mean (μ) of zero and a variance (σ²) of one. The standard normal density function
can be derived from Equation (2-47) by setting μ = 0 and σ² = 1.

$$f(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)$$  (2-51)


In  order  to  transform  any  non-standard  normal  distribution  to  a  standard  normal 
distribution,  the  following  transformation  can  be  applied: 


$$z = \frac{x - \mu}{\sigma}$$  (2-52)

where,

z = transformed value of x
μ = mean of x
σ = standard deviation of x
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An  application  of  Equations  (2-51)  and  (2-52)  will  be  presented  in  Section  3.6.2  of 
this  document. 
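
In the meantime, a minimal Python sketch shows how the transformation is used. The strength distribution (mean 50, standard deviation 4, in arbitrary units) is hypothetical. The standard normal CDF is computed from the error function rather than read from a table.

```python
import math

def standardize(x: float, mu: float, sigma: float) -> float:
    """Transformation to the standard normal variate: z = (x - mu) / sigma."""
    return (x - mu) / sigma

def standard_normal_pdf(z: float) -> float:
    """Standard normal density (mean 0, variance 1)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def standard_normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical strength distribution: mean 50, standard deviation 4.
mu, sigma = 50.0, 4.0
z_low, z_high = standardize(46.0, mu, sigma), standardize(54.0, mu, sigma)
prob = standard_normal_cdf(z_high) - standard_normal_cdf(z_low)
print(z_low, z_high, round(prob, 4))
```

The interval 46 to 54 maps to z = -1 to z = +1, so the probability is the familiar one-sigma value of about 68%, regardless of the particular mean and standard deviation.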


Another  use  of  the  normal  distribution  involves  its  application  to  system 
parameter  tolerance  analysis.  A  tolerance  analysis  is  designed  to  look  at  the 
consequences  on  system  outputs,  due  to  the  simultaneous  "drift"  of  individual 
components.  Several  computer  programs  are  available  to  aid  in  this  kind  of 
analysis,  by  simulating  system  behavior  while  choosing  various  values  of 
component  parameters  from  an  assumed  normal  parameter  distribution. 

2.9.4  Log-Normal  Distribution 


Like  the  Weibull  distribution,  the  log-normal  distribution  is  a  versatile  statistical 
distribution  which  can  assume  a  range  of  shapes.  The  log-normal  does  not  have  the 
normal distribution's disadvantage of the variate extending below zero to -∞. The
log-normal  distribution  is  often  found  to  be  a  good  fit  to  empirical  data.  Therefore, 
the  log-normal  distribution  is  a  natural  candidate  as  a  time-to-failure  model.  Of 
late,  the  log-normal  distribution  has  gained  popularity  because  of  its  applicability  to 
the  reliability  analysis  of  the  fatigue  life  of  certain  types  of  mechanical  components 
and  to  the  maintainability  analysis  of  time-to-repair  data. 


The  log-normal  distribution  implies  that  the  logarithms  of  the  random  variable 
are  normally  distributed.  The  log-normal  density  function  is: 


$$f(x) = \begin{cases} \dfrac{1}{\sigma x (2\pi)^{1/2}} \exp\left[-\dfrac{(\ln x - \mu)^2}{2\sigma^2}\right], & 0 < x < \infty \\ 0, & \text{elsewhere} \end{cases}$$  (2-53)


The  mean  and  standard  deviation  of  the  log-normal  distribution  are  given  by: 


$$\text{Mean} = \exp\left(\mu + \frac{\sigma^2}{2}\right)$$  (2-54)

$$\text{Standard Deviation} = \left[\exp\left(2\mu + 2\sigma^2\right) - \exp\left(2\mu + \sigma^2\right)\right]^{1/2}$$  (2-55)

where μ and σ are the mean and standard deviation of the normal distribution
whose variate is the natural logarithm of the data.
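
These closed forms can be cross-checked numerically. The sketch below uses the standard log-normal results, mean = exp(μ + σ²/2) and standard deviation = [exp(2μ + 2σ²) - exp(2μ + σ²)]^(1/2), and compares them with moments obtained by integrating over y = ln x. The parameter values μ = 1.0 and σ = 0.5 are hypothetical, and the helper name `moment_by_quadrature` is ours.

```python
import math

def lognormal_mean(mu: float, sigma: float) -> float:
    """Closed-form mean: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + sigma ** 2 / 2.0)

def lognormal_sd(mu: float, sigma: float) -> float:
    """Closed-form SD: sqrt(exp(2mu + 2sigma^2) - exp(2mu + sigma^2))."""
    return math.sqrt(math.exp(2 * mu + 2 * sigma ** 2) - math.exp(2 * mu + sigma ** 2))

def moment_by_quadrature(mu, sigma, power, steps=200_000):
    """E[X^power] by a midpoint rule over y = ln x, with y normal(mu, sigma)."""
    lo, hi = mu - 12.0 * sigma, mu + 12.0 * sigma
    dy = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * dy
        phi = math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        total += math.exp(power * y) * phi * dy
    return total

mu, sigma = 1.0, 0.5   # hypothetical parameters of ln(x)
m1 = moment_by_quadrature(mu, sigma, 1)
m2 = moment_by_quadrature(mu, sigma, 2)
print(lognormal_mean(mu, sigma), m1)
print(lognormal_sd(mu, sigma), math.sqrt(m2 - m1 ** 2))
```

Note that μ and σ describe ln x, not x itself, so the mean of the data is exp(μ + σ²/2) rather than exp(μ).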


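The hazard function for the log-normal distribution involves the standard
normal probability density function, φ(x), and the standard normal cumulative
distribution function, Φ(x). The hazard function is given by:

$$h(x) = \frac{\phi\left(\dfrac{\ln x - \mu}{\sigma}\right)}{\sigma x \left[1 - \Phi\left(\dfrac{\ln x - \mu}{\sigma}\right)\right]}$$  (2-56)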


The  hazard  functions  for  the  log-normal  distribution  quickly  increase  to  a 
maximum  value  then  decrease  relatively  slowly. 
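
This rise-then-fall behavior can be seen by evaluating the hazard on a grid. The sketch below writes the log-normal hazard as h(x) = f(x)/R(x), with f and R expressed through the standard normal density and CDF. The parameters μ = 0 and σ = 1 and the grid from 0.05 to 10 are hypothetical choices.

```python
import math

def std_normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_hazard(x: float, mu: float, sigma: float) -> float:
    """h(x) = f(x) / R(x), written in terms of the standard normal functions."""
    z = (math.log(x) - mu) / sigma
    return std_normal_pdf(z) / (sigma * x * (1.0 - std_normal_cdf(z)))

mu, sigma = 0.0, 1.0                              # hypothetical parameters
grid = [0.05 * i for i in range(1, 201)]          # x from 0.05 to 10.0
rates = [lognormal_hazard(x, mu, sigma) for x in grid]
peak = max(range(len(rates)), key=rates.__getitem__)
print(f"peak near x = {grid[peak]:.2f}, h = {rates[peak]:.3f}")
print(f"h at x = 10: {rates[-1]:.3f}")
```

The peak falls in the interior of the grid, with lower hazard values on both sides, which is the shape described above.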

The log-normal probability density function and the hazard functions are shown
in Figure 2.9-3 and Figure 2.9-4, respectively, for various choices of μ and σ.

2.9.5  Type  I  Extreme-Value  or  Gumbel  Distribution 

Three types of extreme-value distributions were developed by Gumbel (1958) for
describing minimum or maximum extreme values. Type II and III Extreme-Value
distributions are not typically useful in failure studies. The Type I Extreme-Value
distribution has found utility as a failure model, especially in cases where failure is
due to a corrosive process.

In reliability work, the failure of a component may frequently be linked to
phenomena which occur at the extremes. In these cases we are interested in the
distribution of the minimum or maximum value in a sample from some initial
distribution of the common population. It cannot be assumed that the distribution
of the population is necessarily a good model for the extremes. Extreme-value
statistics were developed to describe these situations.
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FIGURE  2.9-4:  HAZARD  FUNCTIONS  FOR  LOG-NORMAL  DISTRIBUTIONS 
WITH  DIFFERENT  PARAMETER  VALUES 

The  Type  I  Extreme-Value  distribution  for  maximum  or  minimum  values  is 
applicable  when  the  underlying  distribution  for  the  extremes  is  of  the  exponential 
type.  These  are  distributions  whose  cumulative  probability  approaches  unity  at  a 
rate  which  is  equal  to  or  greater  than  that  for  the  exponential  distribution.  This 
includes  most  reliability  distributions,  such  as  the  normal,  log-normal  and 
exponential  distributions. 

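The probability density function, f(x), and the hazard function, h(x), for the Type I
Extreme-Value distribution of maximum elements are:

$$f(x) = \frac{1}{\sigma} \exp\left\{ -\frac{1}{\sigma}(x - \mu) - \exp\left[ -\frac{1}{\sigma}(x - \mu) \right] \right\} \tag{2-57}$$

$$-\infty < x < \infty, \quad -\infty < \mu < \infty, \quad \sigma > 0$$

$$h(x) = \frac{\exp\left[ -\frac{1}{\sigma}(x - \mu) \right]}{\sigma \left\{ \exp\left[ \exp\left( -\frac{1}{\sigma}(x - \mu) \right) \right] - 1 \right\}} \tag{2-58}$$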

The parameters μ and σ are the location and the scale parameters, respectively,
of the distribution.


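The probability density function, f(x), and the hazard function, h(x), for the Type I
Extreme-Value distribution of minimum elements are:

$$f(x) = \frac{1}{\sigma} \exp\left\{ \frac{1}{\sigma}(x - \mu) - \exp\left[ \frac{1}{\sigma}(x - \mu) \right] \right\} \tag{2-59}$$

$$-\infty < x < \infty, \quad -\infty < \mu < \infty, \quad \sigma > 0$$

$$h(x) = \frac{1}{\sigma} \exp\left[ \frac{1}{\sigma}(x - \mu) \right] \tag{2-60}$$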


Note that the Type I Extreme-Value probability density functions contain no
shape parameter, and thus there is only a single shape. This limits the versatility of
the Type I Extreme-Value distribution. Plots of the probability density functions are
given in Figure 2.9-5 for μ = 5 and σ = 1. Graphs of the hazard functions are
given in Figure 2.9-6.


FIGURE 2.9-5: TYPE I EXTREME-VALUE DISTRIBUTIONS FOR SMALLEST
AND LARGEST ELEMENTS WITH μ = 5 AND σ = 1
Notice from viewing Figure 2.9-5 that the Type I Extreme-Value probability
density functions for the largest and smallest elements are mirror images of one
another. The distribution of maximum values is right skewed, and the distribution
of minimum values is left skewed. The hazard function for the smallest element
increases exponentially with x; for the largest element it increases at a
decreasing rate and approaches a constant value asymptotically.


FIGURE 2.9-6: HAZARD FUNCTIONS FOR TYPE I EXTREME-VALUE
DISTRIBUTIONS FOR SMALLEST AND LARGEST ELEMENTS
WITH μ = 5 AND σ = 1
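
The mirror-image property and the two hazard behaviors described above can be verified with a few lines of code. This is a minimal sketch that assumes the usual Type I (Gumbel) forms with location μ and scale σ; the function names are hypothetical, and μ = 5, σ = 1 match the values used in the figures.

```python
import math

def ev1_max_pdf(x, mu, sigma):
    """Type I extreme-value density for largest elements."""
    z = (x - mu) / sigma
    return math.exp(-z - math.exp(-z)) / sigma

def ev1_max_hazard(x, mu, sigma):
    """Hazard for largest elements; expm1(y) computes exp(y) - 1 accurately."""
    z = (x - mu) / sigma
    return math.exp(-z) / (sigma * math.expm1(math.exp(-z)))

def ev1_min_pdf(x, mu, sigma):
    """Type I extreme-value density for smallest elements."""
    z = (x - mu) / sigma
    return math.exp(z - math.exp(z)) / sigma

def ev1_min_hazard(x, mu, sigma):
    """Hazard for smallest elements: a pure exponential in x."""
    z = (x - mu) / sigma
    return math.exp(z) / sigma

mu, sigma = 5.0, 1.0
for d in (0.5, 1.5, 3.0):
    # Densities reflected about x = mu coincide
    print(d, ev1_max_pdf(mu + d, mu, sigma), ev1_min_pdf(mu - d, mu, sigma))
```

Under these assumed forms, the hazard for largest elements levels off at 1/σ for large x, while the hazard for smallest elements grows by a factor of e for every increase of σ in x.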


Reliability  Analysis  Center  (RAC)  •  201  Mill  St.,  Rome,  NY  13440-6916  •  (315)  337-0900 




Mechanical  Applications  in  Reliability  Engineering  -  NPS 


2.9.6  Distribution  Summary 

This  section  summarizes  the  major  characteristics  of  each  of  the  five  continuous 
statistical  distributions  discussed  in  Section  2.9.  Figure  2.9-7  represents  the  family  of 
statistical  distributions  and  the  relationships  which  exist  among  them.  Table  2.9-2 
summarizes  the  important  mathematical  functions  which  are  associated  with  each 
distribution. 
TABLE 2.9-2: A SUMMARY OF IMPORTANT DISTRIBUTIONS
USED IN RELIABILITY

Distribution Type     Probability Density Function, f(x)
-----------------     ----------------------------------
Exponential           f(x) = λ exp(-λx)
Weibull               f(x) = (β/α^β) (x - x0)^(β-1) exp{-[(x - x0)/α]^β}
Normal                f(x) = [1/(σ √(2π))] exp[-(x - μ)² / (2σ²)]

[Table rows for the range of variate x, the probability density function graph,
the hazard function h(x) and the hazard function graph, and the continuation of
the table, are not reproduced.]
3.0 EVALUATING THE RELIABILITY OF PARTS

3.1  Mechanisms  of  Mechanical  Failure 

The study of mechanical reliability is based on understanding the process of
mechanical failure. The process of mechanical failure is described by the failure
mechanism. To be proficient at identifying and describing mechanical failure
mechanisms, a reliability engineer may have to draw on experience
obtained from many different disciplines, such as fracture mechanics, tribology,
materials science, physics, chemistry or metallurgy, to name just a few. Consider the
following: a rolling element bearing race experiences adhesive wear and surface
pitting fatigue as a result of loss of lubrication, which results in excessive vibration
and noise. In this example, adhesive wear and surface pitting fatigue are the failure
mechanisms which describe the process of failure. Loss of lubrication can be
identified as the cause of failure. Excessive vibration and noise can be identified as
the mode of failure, where the mode of failure describes the effect of failure on the
function of a part. Example failure mode distributions are illustrated in Table 3.1-1
for eleven different device types. A full complement of generic failure mode
distributions is presented in Reference [24]. Each mechanical failure can therefore be
characterized by a cause, a mechanism (process) and a mode (relative consequence)
of failure.

Most  mechanisms  of  mechanical  failure  can  be  categorized  by  one  of  the 
following  mechanical  failure  processes: 

Failure  by:  a)  Distortion 

b)  Fracture 

c)  Wear 

d)  Corrosion 

These  "macro-mechanisms"  represent  the  broadest  class  of  failure  mechanisms. 
Each  macro-mechanism  contains  a  number  of  more  specific  failure  mechanisms, 
some  of  which  are  presented  in  Table  3.1-2. 
TABLE 3.1-1: NORMALIZED FAILURE MODE DISTRIBUTIONS

Device Type                   Failure Mode                  Failure Mode
                                                            Probability (α)
----------------------------  ----------------------------  ---------------
Accumulator                   Leaking                       .47
                              Seized                        .23
                              Worn                          .20
                              Contaminated                  .10

Actuator                      Spurious Position Change      .36
                              Binding                       .27
                              Leaking                       .22
                              Seized                        .15

Adapter                       Physical Damage               .33
                              Out of Adjustment             .33
                              Leaking                       .33

Alarm                         False Indication              .48
                              Failure to Operate on Demand  .29
                              Spurious Operation            .18
                              Degraded Alarm                .05

Antenna                       No Transmission               .54
                              Signal Leakage                .21
                              Spurious Transmission         .25

Battery, Lithium              Degraded Output               .78
                              Startup Delay                 .14
                              Short                         .06
                              Open                          .02

Battery, Lead Acid            Degraded Output               .70
                              Short                         .20
                              Intermittent Output           .10

Battery, Rechargeable, Ni-Cd  Degraded Output               .72
                              No Output                     .28

Bearing                       Binding/Sticking              .50
                              Excessive Play                .43
                              Contaminated                  .07

Belt                          Excessive Wear                .75
                              Broken                        .25

Blower Assembly               Bearing Failure               .45
                              Sensor Failure                .16
                              Blade Erosion                 .15
                              Out of Balance                .10
                              Short Circuit                 .07
                              Switch Failure                .07
TABLE 3.1-2: PRIMARY MECHANISMS OF MECHANICAL FAILURE

Distortion                       Fracture                  Wear                               Corrosion
-------------------------------  ------------------------  ---------------------------------  --------------------
Buckling                         Ductile Fracture          Abrasive Wear (Erosive,            Corrosion-Fatigue
Yielding                         Brittle Fracture            Grinding, Gouging)               Stress-Corrosion
Creep                            Fatigue Fracture          Adhesive Wear (Galling)            Galvanic Corrosion
Creep Buckling                   High-Cycle Fatigue        Subsurface-Origin Fatigue          Crevice Corrosion
Warped                           Low-Cycle Fatigue         Surface-Origin Fatigue (Pitting)   Pitting Corrosion
Plastic Deformation (Permanent)  Residual Stress Fracture  Subcase-Origin Fatigue (Spalling)  Biological Corrosion
Elastic Deformation (Temporary)  Embrittlement-Fracture    Cavitation                         Chemical Attack
Thermal Relaxation               Thermal Fatigue Fracture  Fretting Wear                      Fretting Corrosion
Brinelling                       Torsional Fatigue         Scoring
                                 Fretting Fatigue

Distortion  failures  are  characterized  by  either  a  permanent  or  temporary  change 
in  either  the  size  or  shape  of  a  part  which  prevents  the  part  from  performing  its 
intended  function.  Since  engineering  materials  have  various  degrees  of  elasticity, 
they  are  expected  to  distort  under  load  or  change  in  temperature.  But  when  the 
magnitude  of  this  distortion  exceeds  certain  limits,  the  integrity  and  function  of  the 
part  can  be  compromised  thus  causing  failure.  Examples  of  distortion  failures 
include:  yielding,  creep  (gradual  distortion)  and  buckling  (compression  instability). 
Each  of  these  examples  describes  the  failure  mechanism  or  failure  process. 

In  the  case  of  fatigue  and  fracture,  wear  and  corrosion,  numerous  references 
such  as  those  on  fracture  mechanics  (Reference  [96])  and  tribology  (Reference  [95]) 
discuss  these  mechanisms  in  greater  detail  than  is  appropriate  here  and  the  reader  is 
referred  to  those  sources  for  more  information. 

3.2  Mechanical  Failure  Theories 

In  this  section  we  will  consider  two  of  the  more  accurate  combined  stress 
theories  of  failure  and  their  importance  in  design  reliability.  The  two  mechanical 
failure  theories  to  be  considered  are  the: 

1)  Maximum  Normal  Stress  Theory 

2)  Distortion  Energy  Theory 
For many design problems, the level of stress can be used by the designer to
predict mechanical failure, which makes it of great importance in designing
safe, reliable products.

Each of these theories was proposed around the turn of the twentieth century to provide
models to predict failure at critical points in mechanical parts subjected to a
multi-axial state of stress. The multi-axial state of stress at a point is defined in Figure 3.2-1,
where stress is the term used to define the magnitude and direction of the internal
forces per unit area acting at a given location on a specific plane. Figure 3.2-1
represents an infinitesimal volume (dx · dy · dz). The stress represented by the
symbol σ defines the normal stress, which is perpendicular to a cube face. The stress
represented by the symbol τ defines the shear stress, which is parallel to the face of
the cube. In general, the triaxial state of stress at a point can be defined by the nine
components of stress as indicated in Figure 3.2-1. The nine components of stress are:

σx, σy, σz, τxy, τyx, τxz, τzx, τyz, τzy.


FIGURE  3.2-1:  COMPLETE  DEFINITION  OF  THE  STATE 
OF  STRESS  AT  A  POINT 
For each triaxial stress state there exists an orientation of the infinitesimal
volume (dx · dy · dz) where all the shear stresses are zero and the normal stresses are a
maximum. These maximum normal stresses are called principal stresses and are
designated by σ1, σ2 and σ3. These maximum normal or principal stresses will be
utilized in our first theory.
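
As an illustration of how principal stresses follow from the stress components, the sketch below computes them as the eigenvalues of the symmetric stress matrix, using the standard closed-form trigonometric solution for a 3×3 symmetric matrix. It assumes the complementary shear stresses are equal (τxy = τyx, τyz = τzy, τzx = τxz), so only six of the nine components are independent. The function name and the numerical stress values are hypothetical.

```python
import math

def principal_stresses(sx, sy, sz, txy, tyz, tzx):
    """Return (s1, s2, s3), s1 >= s2 >= s3, for a symmetric stress state."""
    p1 = txy ** 2 + tyz ** 2 + tzx ** 2
    if p1 == 0.0:
        # No shear: the normal stresses are already principal
        return tuple(sorted((sx, sy, sz), reverse=True))
    q = (sx + sy + sz) / 3.0                      # mean normal stress
    p2 = (sx - q) ** 2 + (sy - q) ** 2 + (sz - q) ** 2 + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # Scaled deviatoric matrix B = (S - q*I) / p
    bxx, byy, bzz = (sx - q) / p, (sy - q) / p, (sz - q) / p
    bxy, byz, bzx = txy / p, tyz / p, tzx / p
    det_b = (bxx * (byy * bzz - byz * byz)
             - bxy * (bxy * bzz - byz * bzx)
             + bzx * (bxy * byz - byy * bzx))
    r = max(-1.0, min(1.0, det_b / 2.0))          # guard against round-off
    phi = math.acos(r) / 3.0
    s1 = q + 2.0 * p * math.cos(phi)
    s3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    s2 = 3.0 * q - s1 - s3                        # trace is invariant
    return s1, s2, s3

# Hypothetical plane-stress state: sigma_x = 80, tau_xy = 30 (same units)
print(principal_stresses(80.0, 0.0, 0.0, 30.0, 0.0, 0.0))
```

For a pure shear state with τxy = 50 and all other components zero, the same routine returns principal stresses of 50, 0 and -50 (to within round-off), which is the familiar result that pure shear is equivalent to equal tension and compression on planes rotated 45 degrees.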

3.2.1  The  Maximum  Normal  Stress  Theory 

The  maximum  normal  stress  theory  was  first  proposed  by  Rankine  and  is  also 
referred  to  as  Rankine’s  theory  of  failure.  Rankine  formulated  his  theory  by 
proposing  that:  "failure  is  predicted  to  occur  in  the  multi-axial  state  of  stress  when 
the  maximum  principal  normal  stress  becomes  equal  to  or  exceeds  the  maximum 
normal  stress  at  the  time  of  failure  in  a  simple  uniaxial  stress  test  using  a  specimen 
of  the  same  material."  Rankine's  theory  can  be  represented  mathematically  as 
follows: 

Failure is predicted when:

$$
\begin{aligned}
\sigma_1 &\ge \sigma_t & & & \sigma_1 &\le \sigma_c \\
\sigma_2 &\ge \sigma_t & &\text{or} & \sigma_2 &\le \sigma_c \\
\sigma_3 &\ge \sigma_t & & & \sigma_3 &\le \sigma_c
\end{aligned}
\tag{3-1}
$$

(Note: Recall that compressive stress is negative and tensile stress is positive)
where,

σ1, σ2, σ3 = maximum normal (principal) stresses

σt = tensile yield strength for ductile materials or tensile fracture
strength for brittle materials

σc = compressive yield strength for ductile materials or compressive
fracture strength for brittle materials


The maximum normal stress theory is most useful for predicting the
failure of primarily brittle materials such as cast iron. For materials that behave in a
ductile fashion, the distortion energy theory is a better choice of failure
theory.
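
The maximum normal stress criterion reduces to a simple comparison once the principal stresses are known. The sketch below is one way to express it, assuming the sign convention stated above (tension positive, compression negative, so the compressive strength is entered as a negative number). The function name and the strength values are hypothetical and are not taken from any material specification.

```python
def max_normal_stress_failure(principal, sigma_t, sigma_c):
    """True if any principal stress reaches the tensile strength sigma_t
    (positive) or the compressive strength sigma_c (negative)."""
    return any(s >= sigma_t or s <= sigma_c for s in principal)

# Hypothetical brittle material: tensile strength 150, compressive strength -600
sigma_t, sigma_c = 150.0, -600.0
print(max_normal_stress_failure((90.0, 0.0, -10.0), sigma_t, sigma_c))   # within limits
print(max_normal_stress_failure((160.0, 0.0, -10.0), sigma_t, sigma_c))  # tensile limit reached
print(max_normal_stress_failure((50.0, 0.0, -650.0), sigma_t, sigma_c))  # compressive limit reached
```

Because each principal stress is checked independently, the criterion ignores any interaction between the stress components, which is consistent with its use for brittle rather than ductile materials.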
3.2.2 Distortion Energy Theory

The distortion energy theory was first proposed as a failure theory in 1904 by
Huber and was later improved by Hencky and von Mises. The distortion energy
theory of predicting failure is based on the postulate that the total strain energy
per unit volume, u, is composed of two parts: the energy of distortion, ud, which
is the energy associated solely with the change in shape, and the energy of
volume, uv, which is the energy associated solely with the change in volume.
Thus, the following relationship was proposed:

u  =  uv  +  ud 

With  these  concepts  in  mind,  the  distortion  energy  theory  can  be  stated  as 
follows: 

"Failure  is  predicted  to  occur  in  the  multi-axial  state  of  stress  when  the 
distortion  energy  per  unit  volume,  ud,  becomes  equal  to  or  exceeds  the 
distortion  energy  per  unit  volume  at  the  time  of  failure  in  a  simple  uniaxial 
stress  test  using  a  specimen  of  the  same  material." 

The final mathematical expression for the distortion energy theory is obtained by
deriving the energy of volume, uv, and the total strain energy per unit
volume, u, and taking their difference to obtain the distortion energy, ud. A detailed
derivation of the distortion energy is presented in Reference [93]. The final
mathematical statement of the distortion energy theory of failure is provided in
Equation (3-2). Failure is predicted to occur when:

$$\frac{1}{2}\left[ (\sigma_1 - \sigma_2)^2 + (\sigma_2 - \sigma_3)^2 + (\sigma_3 - \sigma_1)^2 \right] \ge \sigma_f^2 \tag{3-2}$$

where,

σ1, σ2, σ3 = maximum normal (principal) stresses
σf = uniaxial yield stress

The  application  of  the  distortion  energy  theory  for  predicting  failure  has  proven 
most  successful  when  applied  to  ductile  materials. 
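
The distortion energy criterion is often applied by comparing an equivalent uniaxial stress against the yield stress. The sketch below assumes the criterion in the form ½[(σ1 - σ2)² + (σ2 - σ3)² + (σ3 - σ1)²] ≥ σf²; the function names and the numerical values are hypothetical.

```python
import math

def equivalent_stress(s1, s2, s3):
    """Uniaxial stress having the same distortion energy as (s1, s2, s3)."""
    return math.sqrt(0.5 * ((s1 - s2) ** 2 + (s2 - s3) ** 2 + (s3 - s1) ** 2))

def distortion_energy_failure(s1, s2, s3, sigma_f):
    """True when the distortion energy criterion predicts failure."""
    lhs = 0.5 * ((s1 - s2) ** 2 + (s2 - s3) ** 2 + (s3 - s1) ** 2)
    return lhs >= sigma_f ** 2

# Hypothetical ductile part in pure shear: principal stresses 200, 0, -200,
# with an assumed uniaxial yield stress of 300 (same units throughout)
print(equivalent_stress(200.0, 0.0, -200.0))
print(distortion_energy_failure(200.0, 0.0, -200.0, 300.0))
```

Two checks help confirm the form of the criterion: for a uniaxial state (σ, 0, 0) the equivalent stress equals σ, and for a purely hydrostatic state (σ, σ, σ) it is zero, since hydrostatic loading changes volume but not shape. In the pure shear example, no principal stress reaches 300, yet the equivalent stress (200√3) exceeds it, so the two theories can disagree for the same stress state.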
3.3 Guide to the Application of Part Reliability Prediction Techniques

The  discipline  of  mechanical  reliability  and  reliability  prediction  of  mechanical 
parts  in  particular  is  less  standardized  than  corresponding  techniques  for  electronics. 
Whether  this  is  good  or  bad  is  debatable.  The  lack  of  well  established  techniques  has 
led  to  innovative  approaches  which  address  process  and  design  variability,  irregular 
loading  patterns  and  the  effects  of  maintenance.  These  approaches  to  mechanical 
reliability  prediction  can  serve  as  better  tools  to  evaluate  design  integrity.  The  intent 
of  this  section  is  to  provide  guidance  to  practicing  reliability  engineers  and  to  initiate 
further  dialog  concerning  the  merits  of  available  prediction  techniques  and 
approaches  associated  with  mechanical  reliability. 

Documents such as RADC-TR-83-85, "Reliability Programs for Nonelectronic
Designs" and RADC-TR-85-194, "RADC Nonelectronic Reliability Notebook," are
good sources of mechanical reliability information and guidance.
However, little documented guidance exists which specifically discusses procedures
for mechanical part reliability prediction. This problem needs to be addressed
because quantitative reliability prediction is often imposed as a contractual
requirement and because the process of reliability prediction, if handled properly, yields
useful information for design tradeoff decisions.

To  provide  some  general  guidance  to  practicing  reliability  engineers,  a 
prioritized  list  of  reliability  prediction  techniques  for  mechanical  components  is 
presented  in  Figure  3.3-1.  At  the  top  of  the  pyramid  is  the  analysis  of  test  or 
historical  failure  data  as  the  most  desirable  approach,  and  at  the  bottom  is  the  use  of 
surrogate data sources. Other approaches include empirical reliability models and
stress/strength interference analysis.

These prediction techniques are part of the mechanical part prediction process.
Figure 3.3-2 outlines the flow of a typical mechanical part reliability prediction
process. A critical first step in this process is to identify, locate and obtain the major
information required for the reliability assessment. Information that is not
available to the engineer/analyst will result in assumptions during the assessment
which may or may not be valid. The amount of information available will largely
depend on the stage of system development (conception, design, production or
service). The engineering documentation identified in Figure 3.3-2 provides the
analyst with specific information as detailed in Figure 3.3-3.


[Pyramid, from most desirable technique (top) to least desirable (bottom):]

    Part Failure Data Analysis
    Empirical Reliability Relationships
    Stress/Strength Interference Analysis
    Surrogate Data Sources

FIGURE 3.3-1: HIERARCHY OF RELIABILITY PREDICTION
TECHNIQUES FOR MECHANICAL PARTS

The  next  stage  in  the  procedure  is  to  select  and  apply  the  appropriate  part 
reliability  prediction  technique(s).  Some  of  the  more  popular  prediction  techniques 
are  summarized  below: 

Part  Failure  Data  Analysis 

Analysis of failure data is the preferred approach. Accurate failure data can exist
as part of a historical database for systems or equipment which an organization
produces, operates or manages. Alternatively, data may exist as a product of a
dedicated testing program designed to understand and/or measure the reliability of
a new system or component. When data of this nature is available, the underlying
time-to-failure distribution should be determined. The Weibull distribution
has proven to be particularly effective in characterizing the time-to-failure tendencies
of mechanical parts. It is also necessary to analyze the failed parts and resulting
data in detail to identify trends and to investigate failure mechanisms. In this
manner, the reliability engineer can critically evaluate the design integrity and work
together with design engineering to improve the design. Refer to Section 3.4 for
more detail regarding the analysis of part failure data.
[Flowchart; box labels in reading order, connected by decision branches marked "No":]

•  Obtain System Documentation: (1) Assembly Drawings, (2) Part Drawings,
   (3) Bill of Material, (4) Environmental Profiles, (5) Operating Conditions
•  Select One Relevant Part for Analysis
•  Analyze Time-To-Failure Data Using Statistical Method of Section 3.4
•  Calculate Part Reliability
•  Select the Appropriate Part Reliability Prediction Method (Refer to Figure 3.3-1)
•  Perform or Acquire Force/Stress Analysis
•  Part(s) Summary

FIGURE 3.3-2: MECHANICAL PART RELIABILITY PREDICTION PROCEDURE
(1) Assembly Drawings
    •  An Overview of System Operation
    •  Part Function
    •  Part Interaction

(2) Part Drawings
    •  Dimensions
    •  Material
    •  Part Description
    •  Part Numbers

(3) Bill of Material
    •  A Complete Parts Checklist of the System

(4) Environmental Profiles
    •  Temperature
    •  Humidity
    •  Shock, Vibration
    •  Atmosphere Contaminants

(5) Operating Conditions
    •  Loads
    •  Speeds
    •  Duty Cycle
    •  Lubrication

FIGURE 3.3-3: SPECIFIC INFORMATION GAINED FROM DOCUMENTATION

Empirical Reliability Models


Empirical models are generally based on extensive testing for different
combinations of loading, materials, dimensions and other physical properties. The
models can provide the framework for reliability predictions. Using
these models requires the ability to determine some measure of part life (e.g., L10 or
L50 life) and the ability to determine the Weibull characteristic life from that part life.

Empirical models often involve computation of an L10 life, that is, the time at
which 10% of the population will have failed. If a Weibull distribution has been
determined to be appropriate, then the Weibull scale parameter or characteristic life
(α) can be derived using Equation (2-43) and substituting F(x) = .10 and x = L10,
which yields:

$$\alpha = \frac{L_{10}}{\left[\ln\left(\frac{1}{0.9}\right)\right]^{1/\beta}} \tag{3-3}$$

where,

α = Weibull characteristic life

L10 = time at which 10% have failed (found using the empirical model)

β = Weibull shape parameter
The Weibull hazard rate is then given by:

$$h(x) = \frac{\beta x^{\beta - 1}}{\alpha^{\beta}} \tag{3-4}$$


It has been found that for general classes of mechanical components, the
Weibull shape parameter remains approximately constant while the characteristic
life varies with application stresses, design tolerances, etc. Typical shape parameters
(β) have been found to be 2.5 for gears and 1.5 for tapered roller bearings.


The  use  of  empirical  reliability  models  will  be  discussed  in  more  detail  in 
Section  3.5  of  this  document. 
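
The conversion from an L10 life to a Weibull characteristic life and hazard rate can be sketched as follows. The code assumes the two-parameter Weibull cumulative distribution F(x) = 1 - exp[-(x/α)^β], so that setting F(L10) = 0.10 gives α = L10 / [ln(1/0.9)]^(1/β). The function names and the L10 value of 1000 hours are hypothetical; β = 1.5 is the typical value quoted above for tapered roller bearings.

```python
import math

def weibull_characteristic_life(l10, beta):
    """Characteristic life alpha such that F(l10) = 0.10."""
    return l10 / (math.log(1.0 / 0.9)) ** (1.0 / beta)

def weibull_cdf(x, alpha, beta):
    """Two-parameter Weibull cumulative distribution function (assumed form)."""
    return 1.0 - math.exp(-((x / alpha) ** beta))

def weibull_hazard(x, alpha, beta):
    """Weibull hazard rate, beta * x**(beta - 1) / alpha**beta."""
    return beta * x ** (beta - 1.0) / alpha ** beta

l10 = 1000.0   # hypothetical L10 life from an empirical model, hours
beta = 1.5     # typical shape parameter for tapered roller bearings
alpha = weibull_characteristic_life(l10, beta)
print(f"alpha = {alpha:.0f} hours")
print(f"F(L10) = {weibull_cdf(l10, alpha, beta):.3f}")
print(f"h(L10) = {weibull_hazard(l10, alpha, beta):.3e} per hour")
```

Substituting the computed α back into the cumulative distribution at x = L10 should return 0.10, which is a convenient check on both the shape parameter and the units used.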


Stress/Strength Interference Analysis

Stress/strength interference analysis involves the characterization of statistical
distributions for the stress acting on a mechanical part and for the strength of its material.
Historically, stress and strength have been treated as deterministic values in the
mechanical design process. The most positive benefit of applying stress/strength
interference analysis is the widespread realization that stress and strength are not
deterministic values but are subject to variability. By understanding and modeling
this variability, the mechanical design process can be improved and made more
efficient.
Figure 3.3-4 illustrates the concept of stress/strength interference theory. In
approximate terms, the interference or intersection between the two distributions
represents the probability of failure. In real terms, the probability of failure is
somewhat less than the interference. One problem with stress/strength interference
theory is that the intersection between the two distributions can extend far out to the
distribution tails. Therefore, if an incorrect underlying distribution was selected or
if the variability was not accurately characterized, then the resulting probability of
failure may be significantly in error.


FIGURE 3.3-4: STRESS/STRENGTH INTERFERENCE THEORY

Stress/strength interference analysis will be discussed in more detail in Section
3.6.
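
To make the sensitivity to the distribution tails concrete, the sketch below evaluates the probability of failure for the special case where stress and strength are independent and normally distributed. That distributional choice is an assumption made here for illustration only. In that case the probability that stress exceeds strength is Φ(-z), with z = (mean strength - mean stress) / sqrt(σ_strength² + σ_stress²). The function name and all numerical values are hypothetical.

```python
import math

def interference_failure_probability(mean_strength, sd_strength,
                                     mean_stress, sd_stress):
    """P(stress > strength) for independent normal stress and strength."""
    z = (mean_strength - mean_stress) / math.sqrt(sd_strength ** 2 + sd_stress ** 2)
    # Phi(-z) written with the complementary error function
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical values in consistent units (e.g., ksi)
p_base = interference_failure_probability(50.0, 4.0, 35.0, 3.0)
p_wider = interference_failure_probability(50.0, 4.0, 35.0, 4.0)
print(f"stress sd = 3: P(failure) = {p_base:.5f}")
print(f"stress sd = 4: P(failure) = {p_wider:.5f}")
```

In this example a modest increase in the assumed standard deviation of stress changes the computed probability of failure several-fold, which illustrates why an inaccurate characterization of variability can put the result significantly in error.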


Surrogate  Data  Sources 

Generic failure rate data is available from sources such as RAC Publication
NPRD-91, "Nonelectronic Parts Reliability Data" (Reference [30]) and the "RADC
Nonelectronic Notebook" (Reference [48]). Data is generally grouped in the form of
'N' failures in 'Y' part hours, allowing the computation of an average failure rate.
Failure rates from these surrogate data sources are the least desirable method of
predicting mechanical component reliability. The average failure rates may not
correspond to your particular application and do not account for the possibility of a
time dependent hazard rate. Surrogate failure data sources usually assume the
exponential distribution to be the representative failure model. This assumption
allows a constant hazard rate to be calculated. Surrogate data sources will be
discussed in more detail in Section 3.7.
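
The calculation implied by grouped surrogate data is short enough to sketch directly. Under the exponential assumption, the average failure rate is λ = N / Y and the reliability over an operating time t is R(t) = exp(-λt). The function names and the values of N, Y and t below are hypothetical and are not taken from any data source.

```python
import math

def average_failure_rate(failures, part_hours):
    """Average (constant) failure rate from N failures in Y part hours."""
    return failures / part_hours

def exponential_reliability(t, failure_rate):
    """Reliability at operating time t under a constant hazard rate."""
    return math.exp(-failure_rate * t)

# Hypothetical grouped data: 12 failures in 3.0 million part hours
lam = average_failure_rate(12, 3.0e6)
print(f"failure rate = {lam:.2e} failures per hour")
print(f"R(1000 hours) = {exponential_reliability(1000.0, lam):.4f}")
```

Because λ is constant, the same reliability is predicted for any 1000-hour interval regardless of part age, which is precisely the limitation noted above when the true hazard rate is time dependent.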

Comparison 

Each  of  the  reliability  prediction  tools  described  in  this  Section  has  merits 
depending  on  the  particular  application,  the  availability  of  the  required  data  and  the 
objective  of  the  reliability  prediction  process.  Table  3.3-1  provides  a  list  of  the 
relative  advantages  and  disadvantages  of  the  various  techniques.  Sections  3.4 
through  3.7  will  discuss  each  of  these  part  reliability  prediction  techniques  in  detail. 


TABLE 3.3-1: COMPARISON OF RELIABILITY PREDICTION TECHNIQUES

Failure Data Analysis
    Advantages:
    •  Corresponds to actual or simulated loading conditions
    •  Hazard rate time dependency can be analyzed
    •  Comparison and analysis of data can identify design deficiencies and
       improvements
    Disadvantages:
    •  Data often not available
    •  Even when available, data is often grouped (i.e., individual
       times-to-failure not available)
    •  If design is completely new, a dedicated testing program is required
       which may be expensive

Empirical Reliability Relationships
    Advantages:
    •  Takes advantage of extensive test results
    •  Irregular loading patterns can be accommodated
    Disadvantages:
    •  Models are available only for a few part types
    •  New processes or materials cannot be accommodated
    •  Models often are for L10 life and not hazard rate

Stress/Strength Interference Theory
    Advantages:
    •  Addresses variability of stress and material strength
    •  Quantitative estimates of reliability are available
    Disadvantages:
    •  Result is presented as a probability of failure instead of a hazard rate
    •  Interference is often at the extremes of the distribution tails
    •  Standard deviation for stress is not always available

Surrogate Data Sources
    Advantages:
    •  Quick and inexpensive
    •  Effective for non-critical or low failure rate components
    •  Easy to combine with electronic predictions
    Disadvantages:
    •  Constant failure rates assumed
    •  Failure rates are not application sensitive and have limited accuracy
    •  Doubtful that design improvements will result from the prediction process
3.4 Evaluating Part Reliability Using Part Failure Data

3.4.1 Collecting Part Failure Data


Time-to-failure (TTF) data for parts can be obtained in a number of different
ways. For example, it can be collected directly from the life testing of many identical
parts, or it can be extracted from the failure process of one or many systems which
contain the part(s) of interest. However TTF data is collected, the actual operating
time of the part should always be recorded independently of calendar time. Figure
3.4-1 illustrates the insignificance of a calendar time reference; the calendar time
interval may not be proportional to operating TTF.


[Timelines for Part #1 and Part #2 plotted against calendar time (4/16/91, 4/17/91,
4/18/91; 24-hour intervals), with operating times-to-failure TTF1 = 20 hrs,
TTF2 = 24 hrs and TTF3 = 24 hrs marked.]

FIGURE 3.4-1: INSIGNIFICANCE OF CALENDAR TIME AS A
MEASURE OF PART TIME-TO-FAILURE DATA


All part reliability numerics are generated from the collection and analysis of
TTF data. Yet even collecting this one variable can be an endless task, given the
vast number of part types and different applications of mechanical parts
which exist today. Since this information cannot generally be located in a reference
source, many engineers are faced with performing life tests and then evaluating the
resulting failure data. This section summarizes the procedures required to
successfully evaluate that failure data.
3.4.2 Order Statistics and Ranking of Part Failure Data

After  time-to-failure  (TTF)  data  has  been  collected  on  a  sample  of  identical, 
independent  parts,  the  next  step  is  to  determine  the  appropriate  failure  model 
which  is  representative  of  the  data.  This  is  accomplished  utilizing  order  statistics, 
ranking,  probability  plotting  and  distribution  fitting  techniques  as  illustrated  in 
Figure  3.4-2. 


FIGURE  3.4-2:  PROCEDURE  FOR  EVALUATING  PART  TTF  DATA 


Order  statistics  implies  that  the  part  times-to-failure  are  arranged  from 
minimum  to  maximum  and  any  reference  to  calendar  time  is  removed  and 
replaced  with  actual  operating  time.  Since  the  chronological  ordering  of  part  TTF 
data  is  not  significant,  no  information  is  lost  and  valid  statistics  can  be  obtained  after 
a  reordering.  This  same  characteristic  does  not  apply  to  time  between  failure  (TBF) 
data  for  a  repairable  system.  Figure  3.4-3  illustrates  the  application  of  order  statistics 
to  a  sample  TTF  data  set. 


Original Sample Data Set

Part Number     Life to Failure (10^5 cycles)
-----------     -----------------------------
1               TTF1 = 6.6
2               TTF2 = 1.3
3               TTF3 = 4.0
4               TTF4 = 2.7
5               TTF5 = 5.2
6               TTF6 = 9.8

Ordered Data Set (order statistics applied)

Part Number     Life to Failure (10^5 cycles)
-----------     -----------------------------
2               TTF(1) = 1.3
4               TTF(2) = 2.7
3               TTF(3) = 4.0
5               TTF(4) = 5.2
1               TTF(5) = 6.6
6               TTF(6) = 9.8

FIGURE 3.4-3: APPLICATION OF ORDER STATISTICS FOR PART TTF DATA


The next step in evaluating TTF data is to perform ranking of the order statistics.
The purpose of ranking is to determine the cumulative distribution representing
the entire population of parts from a limited sample size. In order to do this, the
median rank is used. Sample ranking is typically used when the sample size is
small (e.g., 1-60). For larger sample sizes, the proportionate cumulative frequency is
calculated directly.

Median rank tables can be found in most comprehensive statistics or reliability
textbooks. The median rank can also be calculated directly using Benard's formula:

$$\text{Median Rank} = \frac{j - 0.3}{n + 0.4} \tag{3-5}$$

where,

j = failure order number (order statistics applied)
n = sample size

It is important to note that n, the sample size, represents the total number of
parts on test, not just the failures. Therefore, if a test is truncated prior to failure of
all test specimens, the total number of parts on test is used in Benard's formula.

The median ranks are calculated for the data set shown in Figure 3.4-3, using
Benard's relationship, and provided in Figure 3.4-4.


Original data (life to failure, 10^5 cycles):

Part #      Life to Failure

1           6.6
2           1.3
3           4.0
4           2.7
5           5.2
6           9.8

After applying order statistics and median ranks:

Failure     Part #      Life to Failure     Median
Order #                 (10^5 cycles)       Rank %

1           2           1.3                 10.91
2           4           2.7                 26.55
3           3           4.0                 42.18
4           5           5.2                 57.82
5           1           6.6                 73.45
6           6           9.8                 89.09

FIGURE  3.4-4:  APPLICATION  OF  MEDIAN  RANKS  TO  ORDER  STATISTICS 
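
As a minimal sketch, the median ranks can be computed directly from Equation (3-5) instead of being read from tables. The function name `benard_median_rank` and the variable names are illustrative choices, not part of the source. Because Equation (3-5) is an approximation, the results differ from the tabulated median ranks in Figure 3.4-4 by a few hundredths of a percentage point.

```python
def benard_median_rank(j, n):
    """Approximate median rank (as a fraction) for failure order j of n, Eq. (3-5)."""
    return (j - 0.3) / (n + 0.4)


# Ordered lives to failure from Figure 3.4-3, in units of 10^5 cycles.
ordered_ttf = [1.3, 2.7, 4.0, 5.2, 6.6, 9.8]
n = len(ordered_ttf)  # total number of parts on test, not just the failures

median_ranks = [benard_median_rank(j, n) for j in range(1, n + 1)]

for j, (ttf, rank) in enumerate(zip(ordered_ttf, median_ranks), start=1):
    print(f"order {j}: TTF = {ttf:4.1f}, median rank = {100 * rank:5.2f}%")
```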


3.4.3  Preparing  Part  Failure  Data  With  Suspended  Data 


Data suspensions often occur when a test is terminated prior to failure of all test
items, or in the instance where a failure is inadvertently induced by improper
handling or other similar conditions.

In the special case when parts are suspended or have not failed, the failure order
number increment (default = 1 for no suspension) can be modified for all failures
following the suspended part using the following equation:

$$\text{New Failure Order Increment} = \frac{(n + 1) - \text{previous failure order number}}{1 + \text{number of items following present suspended set}} \tag{3-6}$$


Once  the  new  failure  order  number  is  calculated,  Benard's  formula  can  be 
applied  to  calculate  the  median  rank.  This  increment  is  used  on  all  failures 
following  a  suspended  item  until  another  suspended  item  is  reached. 
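
The procedure above can be sketched in code as follows. This is an illustration under stated assumptions: the function names are hypothetical, items are assumed to be pre-sorted by age, and the five-item data set (one suspension in second position) is invented purely to exercise Equation (3-6).

```python
def adjusted_failure_orders(statuses):
    """Return adjusted failure order numbers for items sorted by increasing age.

    statuses: list of "F" (failure) or "S" (suspension).
    """
    n = len(statuses)
    increment = 1.0       # default increment when no suspension has occurred
    previous_order = 0.0  # failure order number of the most recent failure
    orders = []
    for position, status in enumerate(statuses):
        if status == "S":
            items_following = n - (position + 1)
            # Eq. (3-6): new increment for the failures after this suspension.
            increment = ((n + 1) - previous_order) / (1 + items_following)
        else:
            previous_order += increment
            orders.append(previous_order)
    return orders


def benard_median_rank(j, n):
    """Eq. (3-5); j may be a non-integer adjusted failure order number."""
    return (j - 0.3) / (n + 0.4)


# Hypothetical test: five items, the second removed from test before failure.
statuses = ["F", "S", "F", "F", "F"]
orders = adjusted_failure_orders(statuses)
ranks = [benard_median_rank(j, len(statuses)) for j in orders]

for j, r in zip(orders, ranks):
    print(f"failure order {j:4.2f}: median rank = {100 * r:5.2f}%")
```

Note that n in Benard's formula remains the total number of items on test, including the suspended one.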


3.4.4  Weibull  Analysis  of  Part  Failure  Data 


After  the  median  ranks  have  been  established,  distribution  plotting  and  fitting 
techniques  are  applied.  A  number  of  statistical  distributions  are  available  as 
potential  failure  models.  These  failure  distributions  include: 


•  Weibull 

•  Normal 

•  Log-normal 

•  Exponential 


This  is  by  no  means  a  complete  list  of  possible  failure  distributions,  but  does 
include  the  more  popular  choices  among  part  data  analysts.  For  our  purposes,  the 
Weibull  distribution  is  selected  because  it  represents  a  family  of  distributions  and 
can  be  used  to  represent  or  approximate  other  distributions  such  as  the  normal  or 
exponential.  The  Weibull  distribution  is  currently  the  most  frequently  utilized 
initial  failure  model  for  evaluating  TTF  data  from  mechanical  parts.  The  probability 
density  function,  f(x),  for  the  Weibull  distribution  is: 


$$f(x) = \frac{\beta}{\alpha}\left(\frac{x - x_0}{\alpha}\right)^{\beta - 1} \exp\left[-\left(\frac{x - x_0}{\alpha}\right)^{\beta}\right] \tag{3-7}$$

(Note: Other mathematically equivalent forms are available.)

where,

β = Weibull slope or shape parameter

α = Characteristic value or scale parameter (F(α) = 63.2%)

x0 = Location parameter (expected minimum value)

Special  probability  paper  has  been  developed  and  is  commercially  available 
(Team,  Chartwell)3  which  can  be  used  for  Weibull  probability  plotting.  Also, 
commercial  statistical  packages  are  available  which  automate  the  entire  data 
evaluation  process,  although  utilizing  such  packages  before  understanding  the 
theory  of  the  techniques  can  be  hazardous.  Figure  3.4-5A  shows  the  time  to  failure 
data  and  associated  median  ranks  to  be  plotted  which  were  derived  in  Section  3.4.2 
and  shown  in  Figure  3.4-4.  These  coordinates  are  plotted  on  Weibull  probability 
paper  and  the  best  fit  straight  line  is  drawn  through  the  data  points  as  shown  in 
Figure  3.4-5A.  If  the  data  fits  a  straight  line  on  Weibull  probability  paper,  then  the 
part  TTF  data  is  Weibull  distributed. 

If a straight line cannot reasonably be fit through the data points, then the part
TTF data is not Weibull distributed, or several distinct failure mechanisms are
mixed together, or the location parameter (x0) is not zero. Once the line is
constructed on Weibull paper, the Weibull slope, β, can be determined from the
scales provided on the paper. The methods for doing this vary depending on what
paper is used. Most construct either a parallel line as shown in Figure 3.4-5A or a
perpendicular line as shown in Figure 3.4-5B to the fitted line through a specified
origin point. Each method is self-explanatory on viewing the paper.

The characteristic value, α, is determined from the fitted line. The characteristic
value is found by constructing a horizontal line through the ordinate (percent
failure) at 63.2%. Then from the point where the horizontal intersects the fitted
line, drop a vertical line to the time-to-failure (TTF) axis. The resulting value of
time-to-failure is the characteristic value, α. In our case, the data is Weibull
distributed with the following parameters:

β = Weibull slope = 1.5

α = Characteristic value = 5.8 x 10^5 cycles (3-8)

3  -  Team  Graph  Papers,  Box  25,  Tamworth,  N.H.  03886;  phone:  603-323-8843 

Chartwell  Technical  Papers,  H.W.  Peel  &  Co.,  Jeymer  Drive,  Greenford,  Middlesex,  England; 
phone:  01-578-6861 

FIGURE 3.4-5A: WEIBULL PROBABILITY PLOTTING ON
WEIBULL PROBABILITY PAPER

FIGURE 3.4-5B: WEIBULL PLOTTING ON CHARTWELL WEIBULL PAPER

The Weibull mean value (μw) for data which is Weibull distributed can be
calculated using the following relationship:

$$\mu_w = \alpha\,\Gamma\left(1 + \frac{1}{\beta}\right) \tag{3-9}$$

where,

μw = Weibull mean value
α = Weibull characteristic value
β = Weibull shape parameter

Γ(x) = Gamma function, evaluated at x (Refer to Appendix E)

The mean value is significant because it is the most common measure of central
tendency and is useful in characterizing the distribution of failure. The Weibull
mean can also be determined directly from the Weibull plot as shown in Figure 3.4-5B,
but the analyst must realize that the percentile used to estimate mean life is a
function of the Weibull shape parameter, β. Table 3.4-1 provides a list of percentiles
which may be used to evaluate the Weibull mean. Table 3.4-1 was derived by
substituting Equation (3-9) into Equation (2-43) with x = μw and then solving for
F(μw).


TABLE 3.4-1: RELATION BETWEEN β AND WEIBULL MEAN LIFE


β           Percentile Used to Estimate
            Mean Life (see note below)

0.5         75%
1.0         63.2%
1.5         57.5%
2.0         54.5%
2.5         52.5%
3.0         51%
3.44        50%

Note:

$$F(\mu_w) = 1 - \exp\left\{-\left[\Gamma\left(1 + \frac{1}{\beta}\right)\right]^{\beta}\right\}$$
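
The percentiles in Table 3.4-1 can be regenerated from the note to the table. The sketch below uses only the Python standard library; the function name `mean_life_percentile` is an illustrative choice, and the table values are rounded versions of what the code prints.

```python
import math


def mean_life_percentile(beta):
    """Cumulative fraction failed at the Weibull mean, per the note to Table 3.4-1."""
    gamma_term = math.gamma(1.0 + 1.0 / beta)
    return 1.0 - math.exp(-(gamma_term ** beta))


for beta in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.44):
    print(f"shape = {beta:4.2f}: percentile at mean = {100 * mean_life_percentile(beta):5.1f}%")
```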

If we compare the percentile indicated in Figure 3.4-5B, which is based on a
Weibull shape parameter of 1.5, to the percentile indicated in Table 3.4-1 for β = 1.5,
it is concluded that they are approximately equal (57.5%).

Utilizing Equation (3-9), the mean life can be calculated for the data given in
Figure 3.4-5A and Figure 3.4-5B as follows:

$$\begin{aligned}
\mu_w &= \alpha\,\Gamma\left(1 + \frac{1}{\beta}\right) \\
      &= (5.8 \times 10^{5})\,\Gamma\left(1 + \frac{1}{1.5}\right) \\
      &= (5.8 \times 10^{5})\,\Gamma(1.67) \\
      &= 5.24 \times 10^{5}\ \text{cycles}
\end{aligned}$$


Notice  the  value  calculated  above  corresponds  to  life  at  57.5%  probability  of 
failure  in  Figure  3.4-5B. 
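
The same arithmetic can be checked numerically without a Gamma function table. This is a small sketch using the standard library's `math.gamma`; the function name `weibull_mean` is illustrative, and the location parameter is assumed to be zero.

```python
import math


def weibull_mean(alpha, beta):
    """Weibull mean from Eq. (3-9), assuming a zero location parameter."""
    return alpha * math.gamma(1.0 + 1.0 / beta)


# Characteristic value 5.8 x 10^5 cycles and slope 1.5, as read from the plot.
mean_cycles = weibull_mean(alpha=5.8e5, beta=1.5)
print(f"Weibull mean life = {mean_cycles:.3g} cycles")
```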


The final step is to substitute the Weibull parameters derived from Figure 3.4-5A
or 3.4-5B back into the probability density function, f(x); the reliability function,
R(x); and the hazard function, h(x). Substitution of β = 1.5 and α = 5.8 x 10^5 into
Equation (2-43) yields:

$$f(x) = \frac{1.5\,x^{0.5}}{(5.8 \times 10^{5})^{1.5}} \exp\left[-\left(\frac{x}{5.8 \times 10^{5}}\right)^{1.5}\right] \tag{3-10}$$


Substitution of β = 1.5 and α = 5.8 x 10^5 into Equation (2-45) yields:

$$R(x) = \exp\left[-\left(\frac{x}{5.8 \times 10^{5}}\right)^{1.5}\right] \tag{3-11}$$


Substitution of β = 1.5 and α = 5.8 x 10^5 into Equation (2-46) yields:

$$h(x) = \frac{1.5}{5.8 \times 10^{5}}\left(\frac{x}{5.8 \times 10^{5}}\right)^{0.5} \tag{3-12}$$
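
For readers who want to evaluate these three functions at a particular age, a minimal sketch follows. The function names and the evaluation age of 2 x 10^5 cycles are illustrative assumptions; the parameters are those of Equation (3-8) with a zero location parameter.

```python
import math

ALPHA = 5.8e5  # characteristic value in cycles, Eq. (3-8)
BETA = 1.5     # Weibull slope, Eq. (3-8)


def pdf(x):
    """Probability density function, Eq. (3-10)."""
    return (BETA / ALPHA) * (x / ALPHA) ** (BETA - 1) * math.exp(-((x / ALPHA) ** BETA))


def reliability(x):
    """Reliability function, Eq. (3-11)."""
    return math.exp(-((x / ALPHA) ** BETA))


def hazard(x):
    """Hazard function, Eq. (3-12); equal to pdf(x) / reliability(x)."""
    return (BETA / ALPHA) * (x / ALPHA) ** (BETA - 1)


x = 2.0e5  # hypothetical age in cycles, chosen only for illustration
print(f"f(x) = {pdf(x):.3e} per cycle")
print(f"R(x) = {reliability(x):.4f}")
print(f"h(x) = {hazard(x):.3e} per cycle")
```

At x equal to the characteristic value, R(x) evaluates to exp(-1), or about 36.8%, which matches the 63.2% cumulative failure point used to read the characteristic value from the plot.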

With these functions, we have completed the process shown in Figure 3.4-2.
The  determination  of  these  functions  is  sufficient  to  define  the  reliability 
characteristics  of  the  original  data  set  presented  in  Figure  3.4-3. 

3.4.5  Weibull  Paper  Defined 


Since  the  use  of  Weibull  probability  paper  is  of  major  importance  in  the 
evaluation  of  part  time-to-failure  data,  it  is  derived  as  follows.  Recall  that  the  two 
parameter  Weibull  cumulative  distribution  function  F(x)  is: 


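$$F(x) = 1 - \exp\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right] \tag{3-13}$$

Equation (3-13) can be rewritten as:

$$\frac{1}{1 - F(x)} = \exp\left[\left(\frac{x}{\alpha}\right)^{\beta}\right] \tag{3-14}$$

Taking the natural logarithm of Equation (3-14):

$$\ln\frac{1}{1 - F(x)} = \left(\frac{x}{\alpha}\right)^{\beta} \tag{3-15}$$

Taking the natural logarithm once again:

$$\ln\ln\frac{1}{1 - F(x)} = \beta\,(\ln x) - (\beta \ln \alpha) \tag{3-16}$$

Equation (3-16) has the form Y = mX + b
where,

Y = ln ln [1 / (1 - F(x))]
X = ln x
m = β
b = -β ln α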

Equation (3-16), therefore, represents a straight line with a slope of β and an
intercept, b, on the Cartesian X, Y coordinates. Hence, Weibull probability paper is a
plot of:

$$Y = \ln\ln\frac{1}{1 - F(x)} \quad \text{versus} \quad X = \ln x$$


Chartwell  Weibull  probability  paper  available  from  H.W.  Peel  &  Company  Ltd. 
(Middlesex,  England)  is  shown  in  Figure  3.4-6.  An  advantage  of  utilizing  Chartwell 
Weibull  paper  is  that  it  contains  a  scale  where  the  cumulative  percentage  associated 
with  the  mean  can  be  read  directly. 
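
Because Equation (3-16) is a straight line in the transformed coordinates, the hand-drawn fit on probability paper can be approximated by an ordinary least-squares line. The sketch below is one such approach, regressing Y on X, and is offered as an assumption rather than as the document's method. It uses the ordered data and median ranks of Figure 3.4-4, and the variable names are illustrative.

```python
import math

# Ordered lives (10^5 cycles) and median ranks (fractions) from Figure 3.4-4.
ttf = [1.3, 2.7, 4.0, 5.2, 6.6, 9.8]
median_rank = [0.1091, 0.2655, 0.4218, 0.5782, 0.7345, 0.8909]

# Transform to the Weibull-paper coordinates of Equation (3-16).
X = [math.log(t) for t in ttf]
Y = [math.log(math.log(1.0 / (1.0 - F))) for F in median_rank]

# Ordinary least-squares line Y = slope * X + intercept.
n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
    (x - x_bar) ** 2 for x in X
)
intercept = y_bar - slope * x_bar

beta = slope                          # m in Equation (3-16)
alpha = math.exp(-intercept / beta)   # from b = -beta * ln(alpha)

print(f"Weibull slope          = {beta:.2f}")
print(f"characteristic value   = {alpha:.2f} x 10^5 cycles")
```

The fitted values land close to those read graphically from the plot (a slope of about 1.5 and a characteristic value near 5.8 x 10^5 cycles); small differences are expected because a line drawn by eye is not a least-squares fit.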

3.5  Failure  Rate  Models  and  Life  Assessment  Models  Used  for  Evaluating 
Part  Reliability 

3.5.1  Base  Failure  Rate  With  Adjustment  Factors  Models 

The  base  failure  rate  with  adjustment  factors  model  follows  the  generalized 
form: 

$$\lambda_p = \lambda_b \prod_{i=1}^{n} \pi_i \tag{3-17}$$

where,

λp = predicted part failure rate
λb = base failure rate
πi = adjustment factors
n = number of required adjustment factors

These are not statistical models but rather engineering approximations.

FIGURE 3.4-6: WEIBULL PROBABILITY PAPER

The following inputs are typically required when utilizing the model of
Equation (3-17) to predict failure rates:

•  Part  environment 

•  Part  construction 

•  Part  operating  conditions 

•  Part  quality  level 


Currently,  only  a  limited  number  of  models  of  this  type  are  available  for 
predicting  the  constant  failure  rate  of  mechanical  and  electromechanical  devices. 
The  two  main  sources  containing  developed  models  are  References  [43]  and  [46]. 

The  base  failure  rate  with  adjustment  factors  method  for  predicting  part 
reliability  is  a  well-established  method  and  the  calculations  are  simple  to  apply. 
Both  of  these  features  are  advantages  when  using  this  method.  The  disadvantages 
of  applying  this  method  stem  from  the  fact  that  only  a  few  models  are  currently 
available  and  they  tend  to  disregard  the  time  dependent  nature  of  mechanical  part 
hazard  rates. 


MIL-HDBK-217F  "Reliability  Prediction  of  Electronic  Equipment"  (Version  F 
discussed  here)  is  a  military  handbook  that  uses  the  model  of  Equation  (3-17).  MIL- 
HDBK-217's  primary  concern  is  electronic  devices  but  some  electromechanical 
devices  are  also  considered,  such  as: 


•  Synchros  and  resolvers 

•  Elapsed  time  meters 

•  Relays 

•  Switches 


•  Connectors 

•  Transformers 

•  Coils 

•  Lasers 


MIL-HDBK-217 provides the necessary tables to identify the correct base failure
rate (λb) and adjustment factors (πi) to calculate the predicted failure rate given
the specific physical characteristics and operating conditions of the particular device.

MIL-HDBK-217 has also adopted a failure rate prediction model for fractional
horsepower motors utilizing rolling element grease packed bearings. This model
only accounts for bearing and winding failures. A large body of time-to-failure data
was available from which this model was developed. Bearing and winding failures
accounted for better than 80% of all failures. Attempts to include other causes in the
model failed to produce significant improvements. Application of the model to
D.C. brush motors assumes that the brushes are inspected and replaced and are not a
failure item. The model was developed at Shaker Research Corporation by D.S.
Wilson and R. Smith and is summarized in the technical report RADC-TR-77-408,
entitled Electric Motor Reliability Model. The model adopted by MIL-HDBK-217
is a simplified version of a more complex model. The simplified failure rate model
is given as:

$$\lambda_p = \left[\frac{x^{2}}{\alpha_B^{3}} + \frac{1}{\alpha_W}\right] \times 10^{6}\ \ (\text{failures}/10^{6}\ \text{hours}) \tag{3-18}$$

where,

λp = average failure rate (failures/10^6 hours)

x = motor operating time period, selected by the user, for which average
failure rate is calculated (hours). Each motor must be replaced when it
reaches the end of this operating period to make the calculated λp valid.

αB = bearing Weibull characteristic life value
αW = winding Weibull characteristic life value


The  bearing  and  winding  Weibull  characteristic  life  values  are  determined  from 
tables  contained  in  MIL-HDBK-217  and  are  based  on  the  ambient  temperature 
surrounding  each  part.  Notice  that  Equation  (3-18)  does  not  follow  the  form  of 
Equation  (3-17).  The  form  of  Equation  (3-18)  will  be  discussed  further  in  Section 
3.5.2. 


The  following  example  shows  the  procedural  calculation  of  the  predicted  failure 
rate  of  a  toggle  switch  by  the  method  of  MIL-HDBK-217  and  Equation  (3-17).  The 
tables  which  are  used  come  directly  from  MIL-HDBK-217.  The  example  is  as  follows: 

Given:  A  MIL-SPEC  toggle  switch  is  used  in  a  ground  fixed  environment.  The 
switch  is  a  snap-action  and  is  single-pole,  double-throw.  It  is  operated  on 
the  average  of  one  cycle  per  hour,  and  the  load  current  is  50  percent  of 
rated  and  is  resistive. 

Find:  The  failure  rate  of  the  switch. 

Step 1: The base failure rate (λb) is found in Table 3.5-1 and is determined to
be 0.00045 failures/10^6 hours.

Step 2: The environmental factor πE for a ground fixed environment is
determined from Table 3.5-2 to be 3.0.

Step 3: The contact form factor πC is determined from Table 3.5-3. For a
single-pole, double-throw switch, πC is 1.7.

Step 4: The cycling factor πCYC is determined from Table 3.5-4 to be equal to 1.0.

Step 5: The stress factor πL from Table 3.5-5 for 50 percent stress and a
resistive load is determined to be 1.48.


TABLE 3.5-2: ENVIRONMENTAL FACTOR - πE

Environment     πE

GB              1.0
GF              3.0
GM              18
NS              8.0
NU              29
AIC             10
AIF             18
AUC             13
AUF             22
ARW             46
SF              .50
MF              25
ML              67
CL              1,200

TABLE 3.5-1: BASE FAILURE RATES, λb, FOR SWITCHES

Description         MIL-SPEC      Lower Quality

Snap-action         0.00045       0.034
Non-snap action     0.0027        0.04

TABLE 3.5-3: πC FACTOR FOR CONTACT FORM AND QUALITY

Contact Form    πC

SPST            1.0
DPST            1.5
SPDT            1.7
3PST            2.0
4PST            2.5
DPDT            3.0
3PDT            4.2
4PDT            5.5
6PDT            8.0


TABLE 3.5-4: πCYC FACTOR FOR CYCLING RATES

Switching Cycles per Hour     πCYC

≤ 1 cycle/hour                1.0
> 1 cycle/hour                number of cycles/hour

TABLE 3.5-5: πL STRESS FACTOR FOR SWITCH CONTACTS

                        Load Type
Stress S     Resistive     Inductive     Lamp

0.05         1.00          1.05          1.06
0.1          1.02          1.06          1.28
0.2          1.06          1.28          2.72
0.3          1.15          1.76          9.49
0.4          1.28          2.72          54.6
0.5          1.48          4.77
0.6          1.76          9.49
0.7          2.15          21.4
0.8          2.72
0.9          3.55
1.0          4.77

where,

S = (operating load current) / (rated resistive load current)

Step 6: The failure rate mathematical model for toggle switches is given by:

$$\lambda_p = \lambda_b\,(\pi_E \times \pi_C \times \pi_{CYC} \times \pi_L)$$

Substituting the values for these factors yields the failure rate.

λp = 0.00045 (3.0 x 1.7 x 1.0 x 1.48)

λp = 0.0034 failures/10^6 hours
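
The six steps above reduce to a single product, which the following sketch reproduces. The function name `predicted_failure_rate` and the dictionary keys are illustrative; the numerical values are those selected in Steps 1 through 5.

```python
import math


def predicted_failure_rate(base_rate, adjustment_factors):
    """Eq. (3-17): base failure rate multiplied by all adjustment factors."""
    return base_rate * math.prod(adjustment_factors)


base_rate = 0.00045  # Step 1: MIL-SPEC snap-action switch, failures/10^6 hours
factors = {
    "environment (ground fixed)": 3.0,  # Step 2
    "contact form (SPDT)": 1.7,         # Step 3
    "cycling (1 cycle/hour)": 1.0,      # Step 4
    "load stress (50%, resistive)": 1.48,  # Step 5
}

rate = predicted_failure_rate(base_rate, factors.values())
print(f"predicted failure rate = {rate:.4f} failures/10^6 hours")
```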

3.5.2  Average  Cumulative  Hazard  Rate  Analysis 


The  hazard  rates  experienced  by  many  mechanical  parts  are  not  constant  but 
rather  vary  as  a  function  of  time.  The  average  cumulative  hazard  rate  method 
reduces  a  time  dependent  hazard  rate  into  a  single  average  failure  rate  which  can  be 
treated  as  a  constant  failure  rate  for  a  specified  time  interval.  This  constant  failure 
rate can then be used in constant failure rate reliability predictions. A disadvantage
of this method is that it reduces the accuracy of the original failure model to that of a
constant hazard rate over the specified time interval.


The  following  is  a  list  of  items  required  when  utilizing  the  average  cumulative 
hazard  rate  method  for  components  exhibiting  a  Weibull  failure  distribution: 


(a) Weibull slope or shape parameter, β

(b) Component service life, x (e.g., warranty time, time-to-overhaul, design life)

(c) A point estimate of life that a stated percentile of the component population
will complete or exceed without failure (e.g., L10 life or Weibull
characteristic life)


The  conditional  hazard  rate,  h(x)  was  presented  in  Section  2.8.  The  hazard  rate 
was  defined  in  Equation  (2-23)  as: 


$$h(x) = \frac{f(x)}{1 - F(x)}$$

An average failure rate (λavg) can be determined for the purpose of comparison
with other constant failure rate components. The average failure rate over an
interval x1 to x2 can be defined as:

$$\lambda_{avg} = \frac{1}{x_2 - x_1}\int_{x_1}^{x_2} h(x)\,dx \tag{3-19}$$

If the average failure rate to any time (x) from time zero is utilized, Equation (3-19)
reduces to:

$$\lambda_{avg} = \frac{1}{x}\int_{0}^{x} h(x)\,dx \tag{3-20}$$

However, the integral of the hazard function, h(x), is the cumulative hazard
function, H(x). Equation (3-20) becomes:

$$\lambda_{avg} = \frac{H(x)}{x} = \bar{H}(x) \tag{3-21}$$

Equation (3-21) represents the average cumulative hazard rate or the average
failure rate based on the condition that the part has not yet failed to time x.


The  hazard  function,  h(x),  corresponding  to  the  Weibull  distribution  was 
presented  in  Section  2.9.2,  Equation  (2-46)  as: 


$$h(x) = \frac{\beta}{\alpha}\left(\frac{x - x_0}{\alpha}\right)^{\beta - 1} \tag{3-22}$$

Substituting Equation (3-22) into Equation (3-20) and letting x0 = 0 yields the
average cumulative hazard function or the average failure rate for the Weibull
distribution:

$$\lambda_{avg} = \bar{H}(x) = \frac{x^{\beta - 1}}{\alpha^{\beta}} \tag{3-23}$$

The utility of Equation (3-23) was mentioned earlier in Section 3.5.1, Equation (3-18).
Notice that the motor model, Equation (3-18) of MIL-HDBK-217, has the form of
Equation (3-23). Substitution of β equal to 3 into Equation (3-23) yields the bearing
portion of the model and β equal to 1 yields the winding portion of the motor
model. This motor model is an example of combining average failure rates to
develop a competing risk model.
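
The relationship between Equations (3-18), (3-20) and (3-23) can be checked numerically. In the sketch below, the function names are illustrative and the operating period and characteristic lives are hypothetical values chosen only for the check; they are not taken from MIL-HDBK-217 tables.

```python
import math


def average_weibull_hazard(x, alpha, beta):
    """Eq. (3-23): average failure rate from time zero to x (location parameter zero)."""
    return x ** (beta - 1) / alpha ** beta


def numeric_average_hazard(x, alpha, beta, steps=10_000):
    """Eq. (3-20) evaluated by midpoint-rule integration of the Weibull hazard."""
    width = x / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * width
        total += (beta / alpha) * (t / alpha) ** (beta - 1) * width
    return total / x


def motor_failure_rate(x, alpha_b, alpha_w):
    """Eq. (3-18): average motor failure rate in failures per 10^6 hours."""
    return (x ** 2 / alpha_b ** 3 + 1.0 / alpha_w) * 1e6


# Hypothetical inputs (hours), for illustration only.
x, alpha_b, alpha_w = 10_000.0, 80_000.0, 500_000.0

# Bearing term uses a slope of 3; winding term uses a slope of 1.
competing = (
    average_weibull_hazard(x, alpha_b, 3.0) + average_weibull_hazard(x, alpha_w, 1.0)
) * 1e6

print(f"sum of Eq. (3-23) terms = {competing:.4f} failures/10^6 hours")
print(f"Eq. (3-18)              = {motor_failure_rate(x, alpha_b, alpha_w):.4f} failures/10^6 hours")
```

The two printed values agree because the motor model is simply the sum of two average cumulative hazard rates, one per failure mode.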

The average cumulative hazard rate can only be used effectively if the limitation
of this reliability measure is understood. The primary limitation of this method is:

•  The  average  cumulative  hazard  rate  provides  a  valid  approximation  over 
relatively  short  intervals  of  the  part  service  life.  The  average  cumulative 
hazard  rate  is  only  a  gross  approximation  and  should  only  be  used  when  a 
constant  failure  rate  measure  is  required. 


3.5.3  Life  Assessment  Model  For  Rolling  Element  Bearings 

Using  the  average  cumulative  hazard  rate  method  to  determine  the  average 
failure  rate  of  a  rolling  element  bearing  is  now  considered.  The  following  five  steps 
are  used  to  calculate  the  average  failure  rate  for  a  double  rowed  tapered  roller 
bearing  subject  to  a  purely  radial  load,  F,  of  1,000  pounds  and  a  speed  of  60,000 
revolutions  per  hour.  The  basic  dynamic  capacity,  C,  of  the  bearing  is  11,700  pounds. 
Determine  the  average  failure  rate  for  50,000  hours  of  constant  operation. 


Step  1:  Identify  the  following  bearing  characteristics: 

(a)  double  rowed  tapered  roller  bearing 

(b)  basic  dynamic  capacity  (from  manufacturer)  is  11,700  pounds 

(c) Weibull slope (β) for tapered roller bearing is 1.5 (manufacturer)

(d)  load  life  exponent  (from  empirical  data)  is  10/3  for  roller  bearings 


Step 2: The standard equation for determining the L10 life of bearings, which
can be found in most references on bearing selection, was developed in
the 1940s by Lundberg and Palmgren and has been the accepted
criterion in the rolling element bearing industry since 1947, when their
papers were published. The L10 life equation is given in Equation (3-24):

$$L_{10} = \left(\frac{C}{F}\right)^{p} \times 10^{6}\ \ \text{(revolutions)} \tag{3-24}$$

where,

L10 = the number of revolutions that 90% of a population of
bearings will complete or exceed without failure
C = the basic load rating in pounds
F = the equivalent radial load in pounds
p = 3 for ball bearings, 10/3 for roller bearings (from empirical
data)

Equation (3-24) can be used to determine the point estimate of bearing
life that 90% of the population will complete or exceed without failure:

$$L_{10} = \left(\frac{11{,}700}{1{,}000}\right)^{10/3} \times 10^{6}$$

L10 = 3.6 x 10^9 (revolutions)


Step 3: Convert L10 (revolutions) to L10 (hours) given the speed of 60,000
revolutions per hour:

L10 (hours) = L10 (revolutions) / revolutions per hour

L10 (hours) = 3.6 x 10^9 / 60,000 (3-25)

L10 (hours) = 60,000


Step 4: A graphical method using L10 (hours) and the Weibull slope (β) is
applied to determine the Weibull characteristic life value (α). The
Weibull characteristic life is defined as the life that 36.8% of the
population will complete or exceed without failure. Figure 3.5-1 shows
the point/slope analysis using Weibull probability paper. The
derivation of the characteristic life percentile and the Weibull axis
parameters is given in Appendix A. The Weibull characteristic life
value is approximated to be 2.7 x 10^5 hours.

The Weibull characteristic life value can also be derived analytically
from the Weibull cumulative distribution function as:

$$F(x) = 1 - \exp\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right]$$

$$\exp\left[\left(\frac{x}{\alpha}\right)^{\beta}\right] = \frac{1}{1 - F(x)} \tag{3-26}$$


Reliability  Analysis  Center  (RAC)  •  201  Mill  St.,  Rome,  NY  13440-6916  •  (315)  337-0900 




Mechanical  Applications  in  Reliability  Engineering  -  NPS 


FIGURE 3.5-1: WEIBULL POINT/SLOPE ANALYSIS TO DETERMINE
THE WEIBULL CHARACTERISTIC LIFE VALUE (α)

$$\left(\frac{x}{\alpha}\right)^{\beta} = -\ln[1 - F(x)]$$

so that

$$\alpha = \frac{x}{\{-\ln[1 - F(x)]\}^{1/\beta}}$$

Substitution of the L10 life and the Weibull slope (β) yields:

$$\alpha = \frac{60{,}000}{\{-\ln[1 - 0.10]\}^{1/1.5}}$$

α = 268,967 (hours)

Step 5: Calculate the average failure rate for 50,000 hours of constant operation
using Equation (3-23):

$$\lambda_{avg} = \bar{H}(x) = \frac{x^{\beta - 1}}{\alpha^{\beta}}$$

$$\lambda_{avg} = \frac{(50{,}000)^{0.5}}{(2.7 \times 10^{5})^{1.5}}$$

λavg = 1.6 x 10^-6 (failures/hour)

λavg = 1.6 (failures/10^6 hours)
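
The five steps of the bearing example can be strung together as a short calculation. This is a sketch under the example's stated inputs; variable names are illustrative. The code carries unrounded intermediate values, whereas the text rounds the L10 life to 60,000 hours and the characteristic life to 2.7 x 10^5 hours, so the printed intermediates differ slightly from the text.

```python
import math

# Step 1: bearing characteristics from the example.
basic_dynamic_capacity = 11_700.0  # C, pounds
radial_load = 1_000.0              # F, pounds
load_life_exponent = 10.0 / 3.0    # roller bearings
weibull_slope = 1.5                # tapered roller bearing (manufacturer)
speed = 60_000.0                   # revolutions per hour
service_life = 50_000.0            # hours of constant operation

# Step 2: L10 life in revolutions, Eq. (3-24).
l10_rev = (basic_dynamic_capacity / radial_load) ** load_life_exponent * 1e6

# Step 3: L10 life in hours, Eq. (3-25).
l10_hours = l10_rev / speed

# Step 4: characteristic life from the 10% failure point (analytic route).
alpha = l10_hours / (-math.log(1.0 - 0.10)) ** (1.0 / weibull_slope)

# Step 5: average failure rate over the service life, Eq. (3-23).
lam_avg = service_life ** (weibull_slope - 1.0) / alpha ** weibull_slope

print(f"L10 = {l10_rev:.2e} revolutions = {l10_hours:,.0f} hours")
print(f"characteristic life = {alpha:,.0f} hours")
print(f"average failure rate = {lam_avg * 1e6:.2f} failures/10^6 hours")
```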

3.6  Mechanical  Stress /Strength  Interference  Analysis 

Stress /Strength  Interference  Analysis  is  a  practical  engineering  tool  used  for 
designing  and  quantitatively  predicting  the  reliability  of  mechanical  components 
subjected  to  mechanical  loading.  The  method  was  originally  presented  in  the 
following  technical  reports: 

1) Lipson, C., et al., Reliability Prediction Mechanical Stress/Strength
Interference Models, RADC-TR-66-710, March 1967. (AD/813574).

2) Lipson, C., et al., Reliability Prediction Mechanical Stress/Strength
Interference (Nonferrous), RADC-TR-68-403, December 1968. (AD/856021).

These  reports  are  still  available  and  can  be  obtained  from: 

National  Technical  Information  Service  (NTIS) 

Department  of  Commerce 
5285  Port  Royal  Road 
Springfield,  VA  22161-2171 

(703)487-4650 

or

Defense  Technical  Information  Center  (DTIC) 

DTIC-FDAC 

Cameron  Station,  Bldg.  5 

Alexandria,  VA  22304-6145 

(703)  274-7633  DSN:  284-7633 

The  Mechanical  Stress /Strength  Interference  Analysis  is  a  reliability  prediction 
technique  which  requires  probabilistic  knowledge  of  stress  and  strength  in  a  part  at  a 
particular  point  in  time.  Application  of  this  method  requires  that  the  strength 
distribution  parameters  be  known.  The  technical  reports  mentioned  above  have 
extensive  fatigue  strength  distribution  tables  for  numerous  ferrous  and  nonferrous 
metals.  Included  are  the  effects  of  heat  treatment,  surface  finish,  stress  concentrators 
and  temperature.  The  reliability  of  an  unlimited  number  of  mechanical 
components  can  be  determined  using  this  method.  The  precision  of  this  method 
depends  to  a  large  extent  on  the  accuracy  to  which  the  stress  distribution  can  be 
estimated.  Ways  to  determine  the  stress  distribution  may  include  actual  stress 
measurement  or  simulated  stress  measurement  using  finite  element  analysis. 

The  following  is  a  list  of  data  items  required  when  utilizing  Mechanical 
Stress /Strength  Interference  Analysis: 

a.  engineering  knowledge  of  the  stress  distribution 

b.  engineering  knowledge  of  the  strength  distribution 

1.  alloy 

2.  design  life 

3.  loading  type  (axial,  bending,  torsion) 

4.  surface  finish  (polished,  ground,  etc.) 

5.  heat  treatment  (annealed,  quenched,  etc.) 

6.  operating  temperature 

7.  stress  concentration  factor(s) 

The  data  items  in  (1)  through  (7)  above  can  be  used  to  establish  the  strength 
distribution. 

The method treats both stress and strength as random variables subject to natural
scatter. The variability in these two factors results in statistical distributions of stress
and strength defined by probability density functions. The area of intersection
created by these distributions is referred to as the "interference," which is shown
graphically in Figure 3.6-1. If failure is defined by:

(Stress > Strength) = failure (3-27)

then interference, when evaluated properly, represents the probability that a random
observation from the stress distribution exceeds a random observation from the
strength distribution, or:

P (Stress > Strength) = interference (3-28)

So,  interference  represents  the  probability  of  failure.  The  reliability  (R)  then  is 
expressed  as: 

R  =  1  -  interference  (3-29) 


FIGURE  3.6-1:  INTERFERENCE  OF  STRESS  AND  STRENGTH  DISTRIBUTION  [48] 

The means of expressing the stress and strength distributions and calculating the
resulting interference created by these distributions are contained in RADC-TR-66-710
(Reference [42]) and RADC-TR-68-403 (Reference [41]). In these technical reports,
this  method  is  applied  to  components:  a)  subjected  to  completely  reversed  cyclic 
bending,  axial  or  torsional  loading  or  b)  subjected  to  a  combination  of  static  and 
cyclic  loads. 


The  original  technical  reports  provide  extensive  tables  giving  the  interference 
(probability  of  failure)  as  a  function  of  distribution  parameters  for  the  following 
combinations  of  distribution: 


Strength Distribution         Stress Distribution

Weibull                       Normal
Weibull                       Weibull
Normal                        Normal
Largest Extreme-Value         Normal
Smallest Extreme-Value        Normal

The assumption of a normal stress and strength distribution is by far the most
popular in discussions of stress/strength interference analysis because the mathematics
is easily managed. But an assumption of a normally distributed stress or strength
distribution should always be justified with sound engineering rationale.
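
To illustrate why the normal/normal case is mathematically convenient, the sketch below evaluates the interference in closed form. It relies on the standard result that the difference of two independent normal variables is itself normal; independence of stress and strength is an assumption made here, and the numerical inputs are hypothetical values chosen only for illustration, not data from the source reports.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_interference(mean_strength, sd_strength, mean_stress, sd_stress):
    """P(stress > strength) for independent, normally distributed stress and strength."""
    z = (mean_strength - mean_stress) / math.hypot(sd_strength, sd_stress)
    return 1.0 - normal_cdf(z)


# Hypothetical values in ksi, for illustration only.
interference = normal_interference(
    mean_strength=30.0, sd_strength=2.0, mean_stress=23.6, sd_stress=1.5
)
reliability = 1.0 - interference  # Eq. (3-29)

print(f"interference = {interference:.5f}")
print(f"reliability  = {reliability:.5f}")
```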

3.6.1  Application  of  Stress/Strength  Interference 

Application  of  the  Stress /Strength  Interference  Method  for  estimating  the 
reliability  of  mechanical  components  takes  several  forms  depending  upon  the 
choice  of  stress  and  strength  distributions.  Numerical  examples  illustrating  the  use 
of  this  method  are  presented  in  this  section.  Further  examples  can  be  found  in  the 
technical  reports  RADC-TR-66-710,  RADC-TR-68-403  and  also  in  the  document 
Nonelectronic  Reliability  Notebook  (RADC-TR-85-194).  Three  separate  cases  are 
considered: 

Case 1: The assumption is made that the standard deviation of the service
stress distribution is negligible compared to the mean stress. The
fatigue strength distribution parameters are selected from the alloy
tables in the source documents. A typical Case 1 situation is shown in
Figure 3.6-2.

Case 2: The assumption is made that the service stress has a normal
distribution and the standard deviation is a fixed percentage of the
mean. The fatigue strength distributions and their parameters are
selected from the alloy tables in the source documents. A typical Case 2
situation is shown in Figure 3.6-3.

Case 3: To use the interference method, a simple completely reversed cyclic
stress must be stated. Case 3 considers an example in which a cyclic stress
has a variable magnitude (stress spectrum). A reduction method is
applied which converts the stress spectrum to a simple cyclic
equivalent (Sequ), as graphically represented in Figure 3.6-4. The
fatigue strength distributions and their parameters are selected from
the alloy tables in the source documents.

A numerical example of each of the three cases is now presented:


FIGURE  3.6-2:  INTERFERENCE  WITH  STANDARD 
DEVIATION  OF  STRESS  EQUAL  TO  ZERO  [42] 


Case  1:  Example 

A cylindrical component is to be subjected in service to completely reversed
bending stresses of ±23.6 ksi at ambient temperature. The design life is 10^6 cycles.
Estimate the reliability if the component is constructed of hot rolled aluminum
alloy 2014, heat-treated to the T6 condition, and mechanically polished.

FIGURE 3.6-3: EFFECTS ON INTERFERENCE WITH CHANGES IN
STANDARD DEVIATION (σ) OF THE STRESS DISTRIBUTION [42]

FIGURE 3.6-4: CONVERSION OF STRESS SPECTRUM TO
EQUIVALENT STRESS [42]

Solution:

The  appropriate  table  is  found  in  RADC-TR-68-403  on  page  182  (Code  37)  which 
is  reproduced  in  Table  3.6-1.  The  density  function  for  hot  rolled  2014-T6  is  the 
Smallest Extreme-Value (S.E.V.), a two-parameter distribution. At 10^6 cycles, the
parameters are:

β = .4523

M = 29.74

The stress level is given as S_equ = 23.6 ksi

Compute:

X = -β(S_equ - M)

X = -.4523(23.6 - 29.74)  (3-30)

X = 2.78

Entering  the  table  on  page  419  of  RADC-TR-68-403  which  is  reproduced  in  Table 
3.6-2  gives  an  "interference"  of  .0602  by  interpolating  for  a  value  of  X  =  2.78.  The 
reliability  estimate  is  given  by: 

R  =  1.0  -  Interference  (3-31) 

R  =  1.0  -  .0602 

R  =  .9398 
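
The table lookup in Case 1 can be cross-checked numerically. The sketch below assumes the closed form F(X) = 1 - exp(-e^(-X)) for the interference. That form is not stated in the text, but it reproduces entries of the interference table such as F(0) = .6321 and F(2.8) = .0590. The function names are illustrative, and the S.E.V. scale parameter is written `beta` here.

```python
import math


def sev_interference(x):
    """Interference for a fixed stress and S.E.V. strength.

    Assumed closed form (not given in the text): F(X) = 1 - exp(-exp(-X)).
    """
    return 1.0 - math.exp(-math.exp(-x))


def case1_reliability(beta, m, s_equ):
    """Return (X, interference, reliability) for a zero-scatter stress."""
    x = -beta * (s_equ - m)              # Equation (3-30)
    interference = sev_interference(x)
    return x, interference, 1.0 - interference  # Equation (3-31)


# Case 1 inputs: hot rolled 2014-T6 at 10^6 cycles, stress of 23.6 ksi
x, interference, reliability = case1_reliability(beta=0.4523, m=29.74, s_equ=23.6)
print(f"X = {x:.3f}, interference = {interference:.4f}, R = {reliability:.4f}")
```

Because the sketch carries X unrounded, its result can differ from the interpolated table value in the fourth decimal place.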

Case  2:  Example 

A  cylindrical  component  is  to  be  subjected  in  service  to  completely  reversed 
bending stresses of ±23.6 ksi at ambient temperature. The design life is 10^6 cycles.
Estimate  the  reliability  if  the  component  is  constructed  of  aluminum  alloy  7079- 
T652,  forged,  and  mechanically  polished. 

TABLE 3.6-1: RADC-TR-68-403 ALLOY TABLE FOR 2014 ALUMINUM

TABLE 3.6-2: RADC-TR-68-403 INTERFERENCE TABLE

  X    Interference F(X)      X    Interference F(X)
 0.0        .6321            2.8        .0590
 0.1        .5954            2.9        .0540
 0.2        .5590            3.0        .0459
 0.3        .5233            3.1        .0441
 0.4        .4884            3.2        .0400
 0.5        .4548            3.3        .0362
 0.6        .4224            3.4        .0328
 0.7        .3914            3.5        .0298
 0.8        .3619            3.6        .0270
 0.9        .3341            3.7        .0244
 1.0        .3078            3.8        .0221
 1.1        .2831            3.9        .0200
 1.2        .2601            4.0        .0182
 1.3        .2385            4.2        .0149
 1.4        .2185            4.4        .0122
 1.5        .2000            4.6        .0100
 1.6        .1829            4.8        .0082
 1.7        .1669            5.0        .0067
 1.8        .1524            5.2        .0055
 1.9        .1389            5.4        .0045
 2.0        .1266            5.6        .0037
 2.1        .1153            5.8        .0030
 2.2        .1049            6.0        .0025
 2.3        .0954            6.2        .0020
 2.4        .0868            6.4        .0017
 2.5        .0788            6.6        .0014
 2.6        .0716            6.8        .0011
 2.7        .0650

Stress Distribution - Normal (σ = 0)
Strength Distribution - Smallest Extreme Value

Solution:


The  appropriate  table  is  found  in  RADC-TR-68-403  on  page  233  (code  288)  which 
is  shown  in  Table  3.6-3.  The  density  function  for  the  fatigue  strength  is  Weibull,  a 
three-parameter distribution. At 10^6 cycles, the Weibull parameters are:

b = 3.096
θ = 28.96
x0 = 20.51

For this example, the standard deviation is assumed from engineering experience to be
5 percent of the mean; the actual standard deviation should be determined
experimentally for the particular situation when possible.

S_equ = 23.6 ksi
σ = .05(23.6) = 1.2 ksi


Interference = .0688

The reliability is:

R = 1.0 - .0688

R = .9312

TABLE 3.6-3: RADC-TR-68-403 ALLOY TABLE FOR 7079 ALUMINUM

Case 3: Example


A design for an aircraft part specifies the following:

Material:                      2024 aluminum (Su = 70 ksi, Sy = 50 ksi)
Design Life:                   10^6 cycles
Type of Loading:               Axial
Size:                          .125" sheet
Surface Finish:                Electropolished
Stress Concentration Factor:   Kt = 1.5 (Milled edge)
Operating Temperature:         Room Temperature

The first step is to determine the strength distribution parameters corresponding
to the design conditions. The strength distribution at 10^6 cycles is found in the alloy
tables in RADC-TR-68-403 on page 194 (Code No. 102), which is shown in Table 3.6-5.
The strength distribution is the smallest extreme value and the parameters are:

β = .8348
M = 19.98 ksi


In order to determine the stress distribution parameters (the mean, μ = S_equ, and
the standard deviation, σ), a prototype of this aircraft part was instrumented and
the stress spectra were recorded as shown in Columns 1 and 2 of Table 3.6-6.

The first step in determining the equivalent stress (S_equ) is to apply Miner's rule^4
to determine the equivalent number of cycles to failure (N_equ) from the stress
spectrum data in Table 3.6-6. Miner's rule, also known as the "Palmgren-Miner
cycle-ratio summation theory," analyzes cumulative fatigue damage. This theory
states that at the point of failure,

$$\sum_{i} \frac{n_i}{N_i} = C \qquad (3-34)$$

4 Miner, M.A., "Cumulative Damage In Fatigue," Journal of Applied Mechanics, Volume 12, 1945.

TABLE 3.6-5: RADC-TR-68-403 ALLOY TABLE FOR 2024 ALUMINUM

TABLE 3.6-6: STRESS AND LIFE DATA FOR MINER'S RULE

          Stress Spectrum                       Miner's Rule Data
Completely* Reversed   Number of            Cycles to       n_i / N_i
Axial Stresses, ksi    Occurrences, n_i     Failure, N_i
        (1)                 (2)                 (3)             (4)
       11.7                 240             5.1 x 10^7      4.70 x 10^-6
       12.5                 217             3.0 x 10^7      7.24 x 10^-6
       13.0                 176             2.1 x 10^7      8.39 x 10^-6
       13.8                 150             1.3 x 10^7     11.52 x 10^-6
       14.1                 110             1.1 x 10^7     10.00 x 10^-6
       14.9                  75             7.2 x 10^6     10.40 x 10^-6
       15.7                  52             4.8 x 10^6     10.81 x 10^-6
       15.9                  20             4.4 x 10^6      4.55 x 10^-6
       16.0                   5             4.2 x 10^6      1.19 x 10^-6

                       Σ n_i = 1035                 Σ (n_i / N_i) = 68.80 x 10^-6

*Actually, the stress was not completely reversed. It was reduced with the aid
of a Goodman diagram to a completely reversed stress.


where,

n_i = number of cycles applied at stress level S_i

N_i = life to failure corresponding to stress level S_i

C = a constant determined experimentally (usually found in the range 0.7 <
C < 2.2, but many authorities recommend using 1.0)

The numbers of cycles to failure (N_i) corresponding to the stresses in Column 1 of
Table 3.6-6 are determined from the S-N curves of the material shown in Figure 3.6-5.
The results are recorded in Column 3 of Table 3.6-6.

FIGURE 3.6-5: S-N DIAGRAM FOR 2024 ALUMINUM (SHEETS)


Using Miner's Rule, Equation (3-34), and the tabulated data in Table 3.6-6, an
equivalent number of cycles to failure (N_equ) is determined as follows:

$$N_{equ} = \frac{\sum n_i}{\sum \dfrac{n_i}{N_i}} = \frac{1035}{68.80 \times 10^{-6}} = 1.51 \times 10^{7} \text{ cycles} \qquad (3-35)$$
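
The Miner's rule reduction can be reproduced from the stress spectrum rows. The following is a minimal sketch that assumes C = 1.0 and uses illustrative names (`SPECTRUM`, `miner_equivalent_cycles`). Note that the occurrence counts as transcribed from Table 3.6-6 sum to 1045 rather than the stated 1035, so the computed N_equ differs slightly from the value in Equation (3-35); the sum of the cycle ratios agrees with the table.

```python
# Rows of Table 3.6-6: (stress in ksi, occurrences n_i, cycles to failure N_i)
SPECTRUM = [
    (11.7, 240, 5.1e7),
    (12.5, 217, 3.0e7),
    (13.0, 176, 2.1e7),
    (13.8, 150, 1.3e7),
    (14.1, 110, 1.1e7),
    (14.9, 75, 7.2e6),
    (15.7, 52, 4.8e6),
    (15.9, 20, 4.4e6),
    (16.0, 5, 4.2e6),
]


def miner_equivalent_cycles(spectrum):
    """Return (sum of n_i, sum of n_i/N_i, N_equ), assuming C = 1.0."""
    total_n = sum(n for _, n, _ in spectrum)
    damage = sum(n / cycles_to_failure for _, n, cycles_to_failure in spectrum)
    return total_n, damage, total_n / damage


total_n, damage, n_equ = miner_equivalent_cycles(SPECTRUM)
print(f"sum n_i = {total_n}, sum n_i/N_i = {damage:.3e}, N_equ = {n_equ:.3e} cycles")
```

The equivalent stress S_equ is still read from the S-N curve at N_equ; the sketch does not attempt to model that curve.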


From the S-N curve (Figure 3.6-5) the stress corresponding to N_equ = 1.51 x 10^7
cycles was extrapolated to approximately:

S_equ = 13.5 ksi

In some engineering applications, the scatter in operating stresses is very small.
Therefore, in these applications the standard deviation of the equivalent stress can
be assumed to be zero. In those engineering applications where the scatter in stress
is appreciable, the standard deviation typically lies in the range,

0.01μ < σ < 0.10μ, where μ is the mean stress

In the absence of any specific information, an average value of σ = .05μ can be
assumed as an approximation. For the present problem, interference will be
calculated for the two cases: σ = 0 and σ = .05μ.

Thus, the stress parameters are:

1. μ = S_equ = 13.5 ksi and σ = 0

2. μ = 13.5 ksi and σ = .05μ = .05(13.5) = .675 ksi

Once  the  strength  and  stress  distribution  parameters  are  determined,  the 
interference  and  the  reliability  (Reliability  =  1.0  -  Interference)  can  be  determined. 

For the case where stress is normally distributed with μ = S_equ and σ = 0 and
strength follows a Smallest Extreme-Value distribution, the interference can be determined
from:

S_equ = 13.50 ksi
β = .8348
M = 19.98

Compute:

X = -β(S_equ - M)  (3-36)

X = -.8348(13.50 - 19.98)

X = 5.41

From the interference table contained in RADC-TR-68-403 on page 419, which is
shown  in  Table  3.6-7,  the  interference  is  determined  to  be  .0045.  The  reliability  is 
then  computed  as: 

R  =  1  -  Interference  (3-37) 

R  =  1  -  .0045 

R  =  .9955 


For cases where stress is normally distributed with σ ≠ 0 and strength is
distributed according to the Smallest Extreme-Value (S.E.V.), the interference is
determined as follows:

Strength (S.E.V.)        Stress (Normal)

β = .8348                μ = S_equ = 13.50 ksi

M = 19.98 ksi            σ = .05μ = .675 ksi


Compute:

α = βσ = (.8348)(.675) = .564  (3-38)

and,

Y = β(S_equ - M) = (.8348)(13.50 - 19.98) = -5.4

and  from  the  table  on  page  422  of  RADC-TR-68-403,  which  is  shown  in  Table  3.6-8, 
the  interference  value  corresponding  to  these  parameters  is  determined  using  linear 
interpolation  to  be  .0059.  The  corresponding  component  reliability  is  calculated  to 
be: 


R  =  1  -  Interference  (3-39) 

R  =  1  -  .0059 

R  =  .9941 

TABLE 3.6-7: RADC-TR-68-403 INTERFERENCE TABLE

  X    Interference F(X)      X    Interference F(X)
 0.0        .6321            2.8        .0590
 0.1        .5954            2.9        .0540
 0.2        .5590            3.0        .0459
 0.3        .5233            3.1        .0441
 0.4        .4884            3.2        .0400
 0.5        .4548            3.3        .0362
 0.6        .4224            3.4        .0328
 0.7        .3914            3.5        .0298
 0.8        .3619            3.6        .0270
 0.9        .3341            3.7        .0244
 1.0        .3078            3.8        .0221
 1.1        .2831            3.9        .0200
 1.2        .2601            4.0        .0182
 1.3        .2385            4.2        .0149
 1.4        .2185            4.4        .0122
 1.5        .2000            4.6        .0100
 1.6        .1829            4.8        .0082
 1.7        .1669            5.0        .0067
 1.8        .1524            5.2        .0055
 1.9        .1389            5.4        .0045
 2.0        .1266            5.6        .0037
 2.1        .1153            5.8        .0030
 2.2        .1049            6.0        .0025
 2.3        .0954            6.2        .0020
 2.4        .0868            6.4        .0017
 2.5        .0788            6.6        .0014
 2.6        .0716            6.8        .0011
 2.7        .0650

Stress Distribution: Normal (σ = 0)
Strength Distribution: Smallest Extreme Value

TABLE 3.6-8: RADC-TR-68-403 INTERFERENCE TABLE

3.6.2 Using Interference Analysis to Evaluate Part Geometry

Component  dimensions  such  as  lengths,  widths,  thickness  and  other  physical 
features  comprise  one  class  of  variables  of  prime  importance  in  mechanical  design. 
Dimensionally described geometry, and especially random geometric variations
created  by  the  production  process,  impose  sizable  direct  influences  on  the  probability 
of  failure  of  components  and  systems.  Statistical  dimensional  descriptions  and 
analyses,  like  the  one  that  follows,  provide  an  opportunity  to  account  for  machining 
and  processing  variability  in  the  design  process  itself  and  estimate  the  impact  of 
dimensional  variability  on  the  performance  of  a  population  of  like  mechanical 
components. 


The  method  of  interference  theory  (refer  to  Figure  3.6-6)  was  employed  to 
evaluate  the  probability  of  losing  an  interference  fit  strictly  based  on  the  expected 
geometric  variations  in  a  stator  assembly.  The  variables  shown  in  Figure  3.6-7  were 
used  to  identify  the  inside  and  outside  unassembled  diameters  for  the  housing, 
intermediate  ring  and  lamination  stack,  which  were  applied  to  this  evaluation. 
Table  3.6-9  identifies  the  dimensions  used  in  the  analysis. 


Probability of Maintaining Interference Fit = $1 - \int_{0}^{\infty} f(Y)\,dY$

Probability of Exceeding 0 Clearance = $\int_{0}^{\infty} f(Y)\,dY$

FIGURE 3.6-6: INTERFERENCE THEORY AS APPLIED TO EVALUATING THE
PROBABILITY OF LOSING/MAINTAINING INTERFERENCE FIT

FIGURE 3.6-7: COMPONENT GEOMETRY VARIABLES (labels: Inner Ring, Outer Ring)


TABLE 3.6-9: RING GEOMETRY

Component                                  Unassembled           Mean Value*,   Standard
Geometry                                   Dimensional           D_i (Inches)   Deviation*, σ_i
Variable   Description                     Tolerances (Inches)                  (Inches)
D0         Inside Diameter, Inner Ring     3.550/3.530           3.54           .0033
D1         Outside Diameter, Inner Ring    3.8225/3.8220         3.82225        .000083
D2         Inside Diameter, Outer Ring     3.82266/3.82166       3.82216        .000167
D3         Outside Diameter, Outer Ring    3.9435/3.9425         3.9430         .00016

*Assumed Normal Distribution


The  following  model  was  then  established  to  evaluate  the  interference  fit  (refer 
to  Figure  3.6-6  and  Table  3.6-9): 

Y = D2 - D1  (3-40)

Note: Y is a new variable that must be negative to maintain an interference fit.

From the algebra of expectation of random variables (Reference: "Probabilistic
Mechanical Design", by E.B. Haugen), the following equations can be applied to
calculate the mean and standard deviation for the new variable, Y.

$$\bar{Y} = \bar{D}_2 - \bar{D}_1 \qquad (3-41A)$$

$$\sigma_Y = \sqrt{\sigma_1^2 + \sigma_2^2} \qquad (3-41B)$$

Substituting the mean values of D1 and D2 into Equation (3-41A) yields:

$$\bar{Y} = \bar{D}_2 - \bar{D}_1 = 3.82216 - 3.82225 = -.00009$$

Substitution of the values of σ_1 and σ_2 into Equation (3-41B) yields:

$$\sigma_Y = \sqrt{\sigma_1^2 + \sigma_2^2} = \sqrt{.000083^2 + .000167^2} = .0001865$$

therefore,

$$(\bar{Y}, \sigma_Y) = (-.00009,\ .0001865)$$

The  probability  of  maintaining  an  interference  fit  (P)  is  shown  graphically  in 
Figure  3.6-6  and  is  given  by  the  following  equation: 


$$P = 1 - \int_{0}^{\infty} f(Y)\,dY \qquad (3-42)$$

Now, assuming the standard normal distribution, the following equation results:

$$P = 1 - \int_{a}^{b} \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}\,du \qquad (3-43)$$

The  limits  of  integration  associated  with  the  standard  normal  distribution  are 
calculated  and  the  probability  of  maintaining  an  interference  fit  is  determined  using 
Table  3.6-10. 

TABLE 3.6-10: AREAS UNDER THE STANDARD NORMAL CURVE

The calculations are as follows:

For variable Y, with mean and standard deviation (-.00009, .0001865), calculate the
lower limit of integration 'a' at Y = 0 as follows:

$$a = \frac{Y - \bar{Y}}{\sigma_Y} = \frac{0 - (-.00009)}{.0001865} = .48$$

Calculate the upper limit 'b' at Y = ∞ as follows:

$$b = \frac{Y - \bar{Y}}{\sigma_Y} = \frac{\infty - (-.00009)}{.0001865} = \infty$$

Now, substitute the new limits of integration into Equation (3-43) and utilize
Table 3.6-10 to evaluate the integral.

$$P = 1 - \int_{.48}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}\,du$$

P = .6844
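
The interference-fit calculation can also be carried out without the normal table. This is a minimal sketch under the same assumption of normally distributed diameters; the function names are illustrative, and the standard normal area is evaluated with the error function rather than read from Table 3.6-10, so the result differs from the tabulated value in the third decimal place because the limit 'a' is not rounded to .48.

```python
import math


def std_normal_cdf(z):
    """Area under the standard normal curve from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_interference_fit(mean_d1, sd_d1, mean_d2, sd_d2):
    """Probability that Y = D2 - D1 is negative (interference fit maintained)."""
    mean_y = mean_d2 - mean_d1                 # Equation (3-41A)
    sd_y = math.sqrt(sd_d1 ** 2 + sd_d2 ** 2)  # Equation (3-41B)
    a = (0.0 - mean_y) / sd_y                  # lower limit of integration
    # P = 1 - (area from a to infinity) = area from -infinity to a
    return mean_y, sd_y, a, std_normal_cdf(a)


# D1: outside diameter of inner ring, D2: inside diameter of outer ring (Table 3.6-9)
mean_y, sd_y, a, p = prob_interference_fit(3.82225, 0.000083, 3.82216, 0.000167)
print(f"mean Y = {mean_y:.5f}, sd Y = {sd_y:.7f}, a = {a:.2f}, P = {p:.4f}")
```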

The sensitivity of the current component geometry to potential failure, in terms
of losing the interference fit, is based on a comparison of the lower limits of integration
associated with the standard normal distribution, which are correlated to the
probability of failure. As this lower limit of integration drops below 3.9, the
probability of maintaining the interference fit decreases.

3.7  Evaluating  Part  Reliability  Using  Surrogate  Data  Sources 

Surrogate  data  sources  provide  estimates  of  reliability  numerics  for  many  generic 
part types. These data sources typically present data in the form of failures/hour,
failures/cycle or failures/mile. Data is generally collected from a wide range of part
applications and operating stress profiles and grouped together based on similar part
types and similar application environments.

Currently, most mechanical reliability databases find it necessary to total the
number  of  observed  failures  and  part  hours.  Computation  of  a  failure  rate  requires 
the  analyst  to  accept  the  assumption  that  failures  of  mechanical  parts  follow  the 
exponential  distribution  and  display  a  constant  failure  rate.  This  assumption  is 
necessary  due  to  the  virtual  absence  of  data  containing  individual  times  or  cycles  to 
failure. 

A  typical  data  set  may  include  the  following  information: 

Total  Operating  Time  =  45,875  hours 

Total  Failures  =  7 

Given  this  information  and  assuming  the  exponential  failure  model,  a  constant 
hazard  rate  is  calculated: 

Failure Rate = 7 / 45,875 = 0.000153 failures/hour
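
Under the exponential assumption described above, the point estimate and the quantities derived from it take only a few lines. The sketch below uses illustrative function names, and the 100-hour mission time is a hypothetical value chosen only to show how the constant rate would be used.

```python
import math


def constant_failure_rate(failures, operating_hours):
    """Point estimate of a constant failure rate (failures per hour)."""
    return failures / operating_hours


def exponential_reliability(rate, mission_hours):
    """Reliability over a mission under the exponential failure model."""
    return math.exp(-rate * mission_hours)


rate = constant_failure_rate(failures=7, operating_hours=45_875)
mean_time_between_failures = 1.0 / rate
r_100 = exponential_reliability(rate, mission_hours=100.0)  # hypothetical mission
print(f"rate = {rate:.6f} failures/hour")
print(f"MTBF = {mean_time_between_failures:.0f} hours, R(100 h) = {r_100:.4f}")
```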

If the actual time-to-failure data were available, it would be much more
appropriate to use Weibull analysis. In the majority of cases, however, the times to
failure are unknown. More often, all that is known is the total number of
failures and the total operating time.

Surrogate  data  sources  almost  invariably  represent  data  for  a  variety  of  similar 
components.  In  the  above  example,  it  may  have  been  assumed  that  all  parts  were  of 
the same make and model. Often, enough data cannot be collected for a particular
piece part, and the data must be combined with data from similar parts. Failure rates
presented  in  surrogate  data  sources  may  be  the  combination  of  several  different 
parts  of  similar  design  and  function. 

The application of the parts and the parts chosen for the application can have a
great effect on the failure rates. A bearing used in a fighter aircraft will show
different failure characteristics than the same bearing used in a stationary,
low-production milling machine. For this reason, it is necessary for surrogate data
sources to separate failure information based on the operational profile. Some
sources have twelve or more different environment profiles. This separation attempts to
account for different stress profiles exhibited in various applications and also to
differentiate between parts chosen for a particular application.

There  are  a  variety  of  shortcomings  with  surrogate  data  sources  that  are  worth 
noting.  The  underlying  failure  distribution  is  typically  assumed  to  be  exponential 
and  constant  failure  rates  are  presented.  This  is  done  for  mathematical  simplicity 
and  ease  of  data  collection.  For  mechanical  parts,  the  exponential  distribution 
assumption  may  not  be  the  most  appropriate  selection.  These  components 
generally  show  an  increased  probability  of  failure  as  operating  time  increases  due  to 
wear, fatigue, corrosion, etc. Finally, perhaps the most overlooked shortcoming
of many surrogate data sources is the lack of a detailed description of the
parts that make up the constant failure rate. Some generic reliability databases have
worked  to  improve  the  level  of  detail  relating  to  their  reliability  numerics  but  much 
work  needs  to  be  done  to  improve  the  quality  of  mechanical  surrogate  data  sources. 
Some  of  the  more  popular  surrogate  data  sources  will  be  discussed  in  this  section. 

3.7.1  Nonelectronic  Parts  Reliability  Data  1991  (NPRD-91) 

NPRD-91  provides  a  comprehensive  source  of  constant  failure  rate  data  for  over 
1,400  different  part  types.  NPRD-91  is  developed  by  the  Reliability  Analysis  Center 
and  represents  a  summary  of  the  RAC  nonelectronic  database.  RAC  has  been 
compiling  this  database  since  1970  and  typically  acquires  data  from  sources  such  as: 

•  Published  reports  and  papers 

•  Data  collected  from  government  sponsored  studies 

•  Data  collected  from  military  systems 

•  Data  collected  from  commercial  systems 

•  Data  submitted  directly  to  RAC 

NPRD-91  provides  "part  summaries”  in  Section  2.0  and  "part  details"  in  Section 
3.0.  An  example  of  the  data  presented  in  the  part  summary  is  shown  in  Figure  3.7-1 
for  a  Ni-Cd  rechargeable  battery. 

NPRD-91 Part Summaries (page 2-10)

Part Description               Qual  App  Data        Fail Per    Total   Operating   Detail
                               Lev   Env  Source      E6 Hours    Failed  Hours (E6)  Page
Battery, Rechargeable, Ni-Cd                          0.5197
                               Com   GF   NPRD-010    0.4305      342     794.3832    3-6
                               Mil                    0.5452
                                     AIF  25100-000   6913.2118   511     0.0739      3-6
                                     SF               0.0234
                                          10219-034   0.1320      8       60.5910     3-6
                                          23020-000   <0.0236     0       42.3975     3-6
                                          NPRD-016    0.0011      8       7438.0000   3-6
                                          NPRD-120    0.0920      4       43.4655     3-6

FIGURE 3.7-1: NPRD-91 PART SUMMARY EXAMPLE


The data fields provided in the part summary listing shown in Figure 3.7-1 have
the following definitions:

• Part Description       Description of the part, including the major family of
                         parts and the specific part type breakdown within the
                         part family.

• Qual Lev               The Quality Level of the part as indicated by:
                         Mil - Parts procured in accordance with MIL
                         specifications.
                         Com - Commercial quality parts.
                         Unk (Unknown) - Data resulting from a device of
                         unknown quality level.

• App Env                The Application Environment describes the conditions
                         of field operation. These environments are consistent
                         with MIL-HDBK-217. In some cases, environments more
                         generic than those used in MIL-HDBK-217 are used. For
                         example, "A" indicates the part was used in an Airborne
                         environment, but the precise location and aircraft type
                         were not known. Environments preceded by the term "No"
                         are indicative of non-operating systems in the
                         specified environment.

• Data Source            Source of data comprising this entry. The source
                         number may be used as a reference to review individual
                         data source descriptions.

• Failure Rate           For individual data entries (same part type,
                         environment, quality, and source), this is the total
                         number of failures divided by the total number of
                         operating hours. For roll-up data entries (i.e., those
                         without sources listed), the failure rate is derived
                         using the data merge algorithm described in this
                         section. A failure rate preceded by a "<" is
                         representative of entries with no failures. The failure
                         rate listed was calculated by using a single failure
                         divided by the given number of operating hours. The
                         resulting number is a worst case failure rate and the
                         real failure rate is less than this value. All failure
                         rates are presented in a fixed format of four digits
                         after the decimal point. The user is cautioned that the
                         data presented has inherently high variability and that
                         four decimal places do not imply any level of precision
                         or accuracy.

• Total Failed           The total number of failures observed in the merged
                         data records.

• Operating Hours (E6)   The total number of operating hours observed in the
                         merged data records, presented in millions of hours.

• Detail Page            The NPRD-91 page number containing the detail data
                         which comprise the summary record.
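
The failure rate convention for individual data entries, including the "<" treatment of entries with no failures, can be sketched as follows. The function names are illustrative, and the sketch does not cover roll-up entries, which rely on the data merge algorithm. The sample calls use individual entries from Figure 3.7-1.

```python
def nprd_failure_rate(failures, operating_hours_e6):
    """Return (failures per 10^6 hours, is_upper_bound) for one data entry.

    With zero failures, a single failure is assumed, and the result is
    flagged as a worst case (upper bound) rate.
    """
    if operating_hours_e6 <= 0:
        raise ValueError("operating hours must be positive")
    if failures == 0:
        return 1.0 / operating_hours_e6, True
    return failures / operating_hours_e6, False


def format_rate(failures, operating_hours_e6):
    """Format a rate in the fixed four-decimal style, with '<' for bounds."""
    rate, is_upper_bound = nprd_failure_rate(failures, operating_hours_e6)
    return f"{'<' if is_upper_bound else ''}{rate:.4f}"


print(format_rate(342, 794.3832))  # Com, GF, source NPRD-010
print(format_rate(8, 60.5910))     # Mil, SF, source 10219-034
print(format_rate(0, 42.3975))     # Mil, SF, source 23020-000 (no failures)
```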


The detailed section can be accessed as indicated by the "Detail Page" field to
acquire  further  design  information  (if  available).  An  example  of  the  part  detail  is 
provided  in  Figure  3.7-2  for  the  Ni-Cd  battery. 


NPRD-91 Part Details (page 3-6)

Part Desc.                     Qual  App  Data        Part                       Fail/Hours
                               Lev   Env  Source      Characteristics            (E6)
Battery, Rechargeable, Ni-Cd   Com   GF   NPRD-010    -No Details, Pop:23568     136/61.2768
                                                      -# Cells:3, Pop:55         0/0.1430
                                                      -# Cells:3, Pop:935        8/2.4310
                                                      -# Cells:21, Pop:44        0/0.1144
                                                      -# Cells:6, Pop:467        3/1.2142
                                                      -# Cells:4, Pop:8387       13/21.8062
                                                      -# Cells:1, Pop:262151     171/681.5926
                                                      -# Cells:20, Pop:1628      3/4.2328
                                                      -# Cells:10, Pop:3033      8/7.8858
                                                      -# Cells:8, Pop:5264       0/13.6864
                               Mil   AIF  25100-000   -No Details                511/0.0739
                               Mil   SF   10219-034   -No Details, Pop:3370      8/60.5910
                               Mil   SF   23020-000   -No Details                0/42.3975
                               Mil   SF   NPRD-016    -No Details                8/7438.0000

FIGURE 3.7-2: NPRD-91 PART DETAIL EXAMPLE

In summary, NPRD-91 provides historical reliability data on a wide variety of
part  types  and  aids  engineers  in  establishing  the  relative  reliability  of  various  part 
types. 

3.7.2  RADC  Nonelectronic  Reliability  Notebook 

The  Nonelectronic  Reliability  Notebook  is  the  result  of  research  conducted  at 
Hughes  Aircraft  Company,  Ground  System  Group,  Fullerton,  California  for  Rome 
Air  Development  Center,  currently  Rome  Laboratory.  The  purpose  of  the  notebook 
is  to  serve  as  a  reference  document  for  the  reliability  characteristics  of  the  most 
commonly used nonelectronic parts. This intent is similar to the purpose of NPRD-91.
The notebook also presents useful reliability and life data analysis methods
applicable to nonelectronic parts, including discussions of:

•  Statistical  failure  models 

•  Design  of  statistical  experiments 

•  Estimation  methods 

•  Interference  analysis 

The reference section of this notebook contains point estimates of the failure rates of the
parts  covered.  In  the  majority  of  cases,  the  nonelectronic  parts  covered  in  the 
Notebook  are  adequately  described  in  the  reliability  sense  by  a  constant  failure  rate. 
When  the  part  does  not  experience  a  constant  failure  rate,  a  Weibull  analysis  is 
presented  where  failure  times  were  available.  Data  was  screened  to  exclude 
secondary  failures  and  failures  caused  by  maintenance  personnel. 

An example of the data contained in the Nonelectronic Reliability Notebook is
shown below.

Pump / Vacuum - Lobe Type                          Identification Number 179

                                Failure Rate (Failures Per Million Hours)
Env   Dist Type   Mean Estimate   80% Lower   Failure Rate   80% Upper
                  (Hours)         Bound       Estimate       Bound
GF                4091            199.898     244.444        298.946

Env   Number of   Number of      Total Part        Comments
      Sources     Parts Failed   Operating Hours
GF    1           22             90000

Pump / Vacuum - Ring Seal Type                     Identification Number 180

                                Failure Rate (Failures Per Million Hours)
Env   Dist Type     Mean Estimate   80% Lower   Failure Rate   80% Upper
                    (Hours)         Bound       Estimate       Bound
GF    [illegible]   90000           2.480       11.111         33.275

Env   Number of   Number of      Total Part        Comments
      Sources     Parts Failed   Operating Hours
GF    1           1              90000
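
The point estimates in the example above follow directly from the failure counts and operating hours. The 80% bounds appear consistent with standard chi-square limits for a constant failure rate (lower bound from 2r degrees of freedom, upper bound from 2r + 2), although the text does not state how they were computed; the sketch below is written under that assumption. All function names are illustrative, and the chi-square quantile is found by bisection on the closed-form CDF for even degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_cdf_even(x, dof):
    """Chi-square CDF for an even number of degrees of freedom (closed form)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_ppf_even(p, dof):
    """Chi-square quantile for even degrees of freedom, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def failure_rate_bounds(failures, hours, confidence=0.80):
    """Return (lower, estimate, upper) in failures per million hours.

    Assumes chi-square limits for a constant failure rate; this is an
    assumption about how the tabulated bounds were derived.
    """
    lower = 0.0
    if failures > 0:
        lower = chi2_ppf_even(1.0 - confidence, 2 * failures) / (2.0 * hours)
    upper = chi2_ppf_even(confidence, 2 * failures + 2) / (2.0 * hours)
    return lower * 1e6, failures / hours * 1e6, upper * 1e6


lobe = failure_rate_bounds(failures=22, hours=90_000)
ring_seal = failure_rate_bounds(failures=1, hours=90_000)
print("Lobe type:      %.3f  %.3f  %.3f" % lobe)
print("Ring seal type: %.3f  %.3f  %.3f" % ring_seal)
```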

3.7.3  IEEE-STD-500  Reliability  Data 


The  IEEE  500  Reliability  Data  book  is  a  guide  which  is  intended  to  establish  a 
consistent  method  for  collecting  and  summarizing  reliability  information  for 
electrical,  electronic,  sensing  components  and  mechanical  equipment  that  is  used  in 
the  nuclear  industry.  The  data  contained  is  intended  for  use  in  reliability  models 
developed  by  designers  and  analysts. 


For the purpose of this edition (IEEE-500-1984), a weighted geometric mean was
chosen as the maximum likelihood estimator of the failure rate. The maximum
likelihood estimator was determined by the following modification of the geometric
mean as described in "A Statistical Model for Combining Biased Expert Opinions",
Los Alamos Report LA-UR-83-531 by H.F. Martz and M.C. Bryson.

$$\hat{\theta} = \prod_{i=1}^{n} \theta_i^{W_i} \qquad (3-44)$$

where,

θ̂ = the maximum likelihood estimate of failure rates or outage times

i = data source number, 1, 2, 3, ...

W_i = a dispersion dependent weighting factor for each data source

where,

$$W_i = \frac{\sigma_i^{-2}}{\sum_{i=1}^{n} \sigma_i^{-2}}$$

σ_i^-2 = reciprocal of the variance of the log of each data source
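
A weighted geometric mean of this kind can be sketched as follows. The input rates and log-variances are hypothetical values chosen only for illustration, the function name is not taken from IEEE-500, and the weights are assumed to be the normalized reciprocals of the log-variances as described above.

```python
import math


def weighted_geometric_mean(rates, log_variances):
    """Combine source estimates with inverse-log-variance weights.

    Returns (combined estimate, list of normalized weights).
    """
    inverse_variances = [1.0 / v for v in log_variances]
    total = sum(inverse_variances)
    weights = [w / total for w in inverse_variances]
    # Product of rate_i ** W_i, computed in log space for stability
    log_estimate = sum(w * math.log(r) for w, r in zip(weights, rates))
    return math.exp(log_estimate), weights


# Hypothetical failure rates (per 10^6 hours) from three data sources
source_rates = [0.20, 0.63, 1.50]
source_log_variances = [1.0, 0.25, 2.0]  # hypothetical dispersion of each source
estimate, weights = weighted_geometric_mean(source_rates, source_log_variances)
print(f"weights = {[round(w, 3) for w in weights]}, estimate = {estimate:.3f}")
```

Sources with a smaller log-variance receive a larger weight, so the combined estimate is pulled toward the more tightly dispersed sources.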


The  data  is  presented  in  a  format  where  a  failure  rate  is  given  with  the  high  and 
low  values  representing  the  range  and  a  recommended  value  (REC)  as  a  best 
estimate. Two types of failure rates are presented where applicable: failures per 10^6
hours and failures per 10^6 cycles. Outage times are also indicated as well as the
failure  rate  for  a  particular  failure  mode.  An  example  of  the  data  contained  in  IEEE- 
500-1984  is  shown  below. 


(Composite of Ref 1, 3, and 7.1.1.X)

CHAPTER: 7 Valve Operators    SECTION: 7.1 Electric    SUBSECTION: 7.1.1 Motors

ITEM OR EQUIPMENT

                                    Failure Rate                          (*) Out of Service,
                                                                          (t) Repair Time or
                     Failures/10^6 Hours       Failures/10^6 Cycles**     ( ) Restore (Hours)
Failure Mode         Low     Rec    High  Ref  Low    Rec    High   Ref   Low    Rec     High    Ref
ALL MODES            0.01    0.63   500        0.72   4.87   50.0         1.0*   41.50   6.49E3
CATASTROPHIC         0.004   0.25   200.0      0.29   1.95   20.0
Spurious Opening     0.002   0.13   100.0      0.15   0.98   10.0
Spurious Closing     0.002   0.12   100.0      0.14   0.97   10.0
DEGRADED             0.006   0.38   300.0      0.43   2.92   30.0
Partially Opening                              0.17   1.17   12.0
Partially Closing                              0.26   1.75   18.0

**One Cycle = One Demand

4.0 EVALUATING THE RELIABILITY OF SYSTEMS

4.1  Concept  of  a  Point  Process 

Evaluating  the  reliability  of  systems  begins  with  the  realization  that  a  sequential 
failure  process  exists  for  the  system.  This  failure  process  is  composed  of  many 
sequential  random  variables. 

This  system  failure  process  is  depicted  in  Figure  4.1-1.  A  point  process  is 
characterized  by  observations  in  the  form  of  point  events  occurring  in  a  continuum 
such  as  time.  Such  processes  arise  in  many  fields  of  study  such  as  economics, 
physics  and  system  reliability.  A  point  process  can  be  defined  by  specifying: 

1)  description  of  each  event  and  the  measure  of  time  (e.g.,  operating  hours 
rounds,  cycles,  etc.) 

2)  the  observed  intervals  between  successive  events  denoted 
TBFi,  TBF2, ...  TBFn  or  the  instants  of  occurrence  of  the  events  measured 
from  the  time  origin  denoted  TTSF^  TTSF2,  TTSF3,  ...  TTSFn 


TTSFj  =  Failure  arrival  times  for  the  system 

TBFi  =  Interarrival  times  or  time  between  (successive)  failure 

FIGURE  4.1-1:  REPAIRABLE  SYSTEM  FAILURE  PROCESS 

The observed intervals between successive events (TBF_1, TBF_2, ...) are termed
interarrival times and the intervals to the occurrence of events measured from the
time origin (TTSF_1, TTSF_2, ...) are termed arrival times. The arrival times are
obtained by forming the cumulative sums of the interarrival times:

$$\mathrm{TTSF}_1 = \mathrm{TBF}_1, \quad \mathrm{TTSF}_2 = \mathrm{TTSF}_1 + \mathrm{TBF}_2, \quad \mathrm{TTSF}_3 = \mathrm{TTSF}_2 + \mathrm{TBF}_3, \quad \ldots, \quad \mathrm{TTSF}_n = \mathrm{TTSF}_{n-1} + \mathrm{TBF}_n \tag{4-1}$$

where,

TTSF_n = the arrival time of the nth event
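
The cumulative-sum relationship of Equation (4-1) can be sketched in a few lines of Python. This is only an illustration: the function name `arrival_times` and the sample interarrival values are assumptions, not part of the original text.

```python
from itertools import accumulate

def arrival_times(interarrival_times):
    """Return arrival times (TTSF) as cumulative sums of interarrival times (TBF)."""
    return list(accumulate(interarrival_times))

# Illustrative interarrival times in operating hours (assumed values).
tbf = [10, 50, 100]
ttsf = arrival_times(tbf)
print(ttsf)  # [10, 60, 160]
```

The order of the input list matters: reordering the same TBF values produces a different set of arrival times, and therefore a different point process.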


Given that a system can be characterized by a point process, a major concern for
the reliability analyst lies in describing the detailed pattern of occurrence. Of
particular concern is whether a trend or some other systematic feature exists. For
example, a trend in which the interarrival times (TBF_i) become smaller over a
period of observation indicates that system performance is deteriorating. The
modeling and analysis of point processes provide measures to quantify such
trends.


Unlike part failure data, the chronological ordering of time-between-failure
(TBF) data is extremely important for a repairable system. Disrupting or failing to
track this ordering of failure events results in wasted effort. This can be illustrated
with three time-between-failure (TBF) values: 10, 50 and 100. If the sequential
order of these events is unknown, then a total of six different system processes
can be created. The total number of unique system processes can be calculated
using the equation for permutations:


$$P^n_r = \frac{n!}{(n-r)!} \tag{4-2}$$

where,

n = total number of objects

r = number of objects selected out of the total number

Substituting n = 3 and r = 3, equation (4-2) yields:

$$P^3_3 = \frac{3!}{(3-3)!} = \frac{6}{1} = 6$$
The six unique system processes, identified by their unique arrangement of
interarrival values, are as follows:

1)  10,  50,  100  (improving  trend) 

2)  10, 100,  50  (no  trend  established) 

3)  50, 10, 100  (no  trend  established) 

4)  50, 100, 10  (no  trend  established) 

5)  100,  50, 10  (deteriorating  trend) 

6)  100, 10,  50  (no  trend  established) 
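
The six orderings listed above can be enumerated programmatically. The following is a minimal sketch: the helper `classify_trend` is hypothetical and applies the simple rule implied by the list (strictly rising TBF values mean an improving trend, strictly falling values mean a deteriorating trend, and anything else means no trend established).

```python
from itertools import permutations

def classify_trend(tbf_sequence):
    """Label a TBF sequence by whether it rises or falls at every step."""
    pairs = list(zip(tbf_sequence, tbf_sequence[1:]))
    if all(a < b for a, b in pairs):
        return "improving trend"
    if all(a > b for a, b in pairs):
        return "deteriorating trend"
    return "no trend established"

# Every possible chronological ordering of the three TBF values.
orderings = list(permutations([10, 50, 100]))
for seq in orderings:
    print(seq, "->", classify_trend(seq))
print(len(orderings))  # 6
```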

If order statistics and distribution plotting techniques could be applied to model
each system process (which they cannot), the same distribution parameters would
be calculated for all six of the above systems, because these techniques ignore the
order in which the values occurred. Order statistics and distribution plotting and
fitting techniques therefore cannot be used to evaluate a single repairable system
point process. If, on the other hand, a number of system failure processes are
available, order statistics and distribution plotting techniques (as discussed for
modeling part TTF data) can be used to evaluate the distribution of
time-to-first-failure (TTFF) of the repairable system. This also holds true for any
other specific interarrival time (such as the time between the first and second
failure). The appropriate system modeling tools are presented and discussed in
this section.

4.2  Point  Process  Models 

When modeling a single repairable system point process, the two most widely
publicized models are the:

•  Homogeneous  Poisson  Process  (HPP) 

•  Nonhomogeneous  Poisson  Process  (NHPP) 

The HPP model can be used to describe a process which is stationary and whose
times between failures show no tendency to increase or decrease as the system ages.
This type of repairable system is characterized by a constant rate of occurrence of
failure (ROCOF). This constant rate is also called the peril rate, ρ.

The NHPP model can be used to describe a process whose times between failures
tend to increase or decrease as the system ages. The NHPP is a good first
approximation for a repairable system because it models a process characterized by a
time-dependent rate of occurrence of failure, ρ(t).

The procedure for selecting which process model should be applied is provided
in Figure 4.2-1. A similar procedure has been recommended by Ascher in Reference
[55].


FIGURE  4.2-1:  SELECTING  THE  APPROPRIATE  PROCESS  MODEL 


Both  process  modeling  and  trend  analysis  procedures  (HPP,  NHPP)  will  be 
discussed  in  Sections  4.2  and  4.3. 
4.2.1 Homogeneous Poisson Process (HPP)


The Homogeneous Poisson Process can be used to model a system failure process
whose times between failures (TBF_i) are independent and identically exponentially
distributed. The interarrival values of the point process (TBF_i) must exhibit no
trend to increase or decrease as the system ages. Interarrival values possessing this
characteristic are referred to as "random" interarrival values. A system that is
neither improving nor deteriorating (i.e., constant rate of occurrence of failure) is
generally a good candidate for the HPP model.


The Poisson process is characterized by the number of failures in any interval
from t_1 to t_2 having a Poisson distribution with mean ρ(t_2 - t_1). The Poisson
process can be characterized as:

$$P\{N(t_2) - N(t_1) = j\} = \frac{e^{-\rho (t_2 - t_1)} \left[\rho (t_2 - t_1)\right]^j}{j!} \tag{4-3}$$

where,

N(t) represents the number of failures to time t and ρ is the constant rate of
occurrence of failure. Equation (4-3) states the probability of having "j" failures
in the interval t_1 to t_2 for a homogeneous Poisson process.

By setting j = 0, the probability of no failure in the interval t_1 to t_2 can be
determined as:

$$P\{N(t_2) - N(t_1) = 0\} = e^{-\rho (t_2 - t_1)} \tag{4-4}$$

Equation (4-4) represents the probability of survival, or reliability, in the interval
t_1 to t_2 which can be represented as:

$$R(t_1, t_2) = e^{-\rho (t_2 - t_1)} \tag{4-5}$$
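
Equations (4-3) and (4-5) translate directly into code. The sketch below is illustrative only: the function names and the ROCOF value of 0.01 failures per hour are assumptions chosen for the example.

```python
import math

def hpp_failure_probability(j, rho, t1, t2):
    """P{N(t2) - N(t1) = j} for an HPP with constant ROCOF rho, Equation (4-3)."""
    mean = rho * (t2 - t1)
    return math.exp(-mean) * mean ** j / math.factorial(j)

def hpp_reliability(rho, t1, t2):
    """Probability of zero failures in the interval t1 to t2, Equation (4-5)."""
    return math.exp(-rho * (t2 - t1))

rho = 0.01  # assumed constant ROCOF, failures per operating hour
print(hpp_failure_probability(0, rho, 0.0, 100.0))
print(hpp_reliability(rho, 0.0, 100.0))
```

Because the ROCOF is constant, the result depends only on the length of the interval, not on where the interval starts.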
4.2.2 Nonhomogeneous Poisson Process (NHPP)


A functional form of the time-variant rate of occurrence of failure (ROCOF), ρ(t),
for the NHPP is:

$$\rho(t) = \lambda \beta t^{\beta - 1}, \qquad \lambda, \beta > 0, \quad t > 0 \tag{4-6}$$

Given a system failure process which contains a trend, the ROCOF, ρ(t), can be
determined from maximum likelihood estimators of β and λ. The maximum
likelihood estimators of β and λ, as shown by Crow (footnote 5), are:


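$$\hat{\beta} = \frac{n}{\sum_{i=1}^{n-1} \ln\left(\dfrac{\mathrm{TTSF}_n}{\mathrm{TTSF}_i}\right)} \tag{4-7A}$$

$$\hat{\lambda} = \frac{n}{\mathrm{TTSF}_n^{\hat{\beta}}} \tag{4-7B}$$

where,

TTSF_i = Arrival times as identified in Figure 4.1-1
n = Total number of system failure events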

FIGURE  4.2-2:  REPAIRABLE  SYSTEM  FAILURE  PROCESS,  EXAMPLE 


5  Crow,  L.H.,  "Reliability  Analysis  for  Complex  Repairable  Systems,"  Reliability  and  Biometry,  F. 
Proschan  and  R.J.  Serfling,  eds.,  SIAM,  Philadelphia,  pp.  379-410,  1974. 
Using Equation (4-7A), calculate $\hat{\beta}$:

$$\hat{\beta} = \frac{5}{\ln\dfrac{515}{10} + \ln\dfrac{515}{60} + \ln\dfrac{515}{160} + \ln\dfrac{515}{335}} = 0.65$$

Using Equation (4-7B), calculate $\hat{\lambda}$:

$$\hat{\lambda} = \frac{5}{515^{0.65}} = \frac{5}{57.9} = 0.086$$

Substitution of $\hat{\beta}$ and $\hat{\lambda}$ into Equation (4-6) yields:

$$\rho(t) = (0.086)(0.65)\, t^{0.65 - 1}$$

$$\rho(t) = 0.056\, t^{-0.35}$$

The expected number of failures in the interval zero to t, V(t), is given by the
following:

$$V(t) = \int_0^t \rho(u)\, du \tag{4-8A}$$

Substituting Equation (4-6) into Equation (4-8A) yields:

$$V(t) = \lambda t^{\beta} \tag{4-8B}$$

Using our example, the expected number of failures after 300 hours is:

$$V(300) = (0.086)(300)^{0.65} = 3.5$$

This value of 3.5 failures corresponds with the expected number of failures after
300 hours for the system shown in Figure 4.2-2.
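
The worked example can be reproduced with a short script. This is a sketch under stated assumptions: the arrival times 10, 60, 160, 335 and 515 are inferred from the ratios in the worked calculation (the figure itself is not reproduced here), and the function names are hypothetical.

```python
import math

def crow_mle(arrival_times):
    """Maximum likelihood estimates (beta, lam) from Equations (4-7A) and (4-7B).

    The last arrival time TTSF_n is used as the end of the observation period.
    """
    n = len(arrival_times)
    t_n = arrival_times[-1]
    beta = n / sum(math.log(t_n / t) for t in arrival_times[:-1])
    lam = n / t_n ** beta
    return beta, lam

def rocof(t, beta, lam):
    """Time-dependent rate of occurrence of failure, Equation (4-6)."""
    return lam * beta * t ** (beta - 1)

def expected_failures(t, beta, lam):
    """Expected number of failures from zero to t, Equation (4-8B)."""
    return lam * t ** beta

ttsf = [10, 60, 160, 335, 515]  # arrival times inferred from the worked example
beta, lam = crow_mle(ttsf)
print(round(beta, 2), round(lam, 3))                # 0.65 0.086
print(round(expected_failures(300, beta, lam), 1))  # 3.5
```

An estimate of β below 1 gives a ROCOF that decreases with time, which corresponds to an improving system; a value above 1 would indicate deterioration.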
4.3 Trend Analysis of System Failure Data

In this section we present two procedures for evaluating whether a trend exists in
a system failure process. The two procedures are:

a) Graphical plot of cumulative failures versus cumulative operating time
using linear scales

b) Laplace test statistic

Each  of  these  trend  analysis  procedures  is  easy  to  apply  and  interpret. 

As indicated in Figure 4.2-1, determining whether or not a trend (i.e.,
increasing or decreasing TBF_i) exists is essential in selecting the appropriate model
for the process.

4.3.1  Plotting  Cumulative  Failures  vs.  Cumulative  Operating  Time 
Let  us  now  consider  the  two  system  failure  processes  defined  in  Table  4.3-1. 


TABLE 4.3-1: SYSTEM FAILURE PROCESS DATA

Failure Order      System A Arrival      System B Arrival
Number (i)         Times (TTSF_i)        Times (TTSF_i)

    1                   15                   177
    2                   42                   242
    3                   74                   293
    4                  117                   336
    5                  168                   368
    6                  233                   395
    7                  410                   410

The data for System A were intentionally fabricated to represent an increasing
trend in the times between failures (TBF_i), which are:

A: 15, 27, 32, 43, 51, 65, 177

The data for System B were intentionally fabricated to represent a decreasing trend
in the times between failures (TBF_i), which are:

B: 177, 65, 51, 43, 32, 27, 15

Both  of  these  systems  (A,  B)  can  be  evaluated  by  constructing  a  plot  of 
cumulative  failures  versus  cumulative  test  time  on  linear  scales  as  shown  in  Figure 
4.3-1.  The  data  from  Table  4.3-1  was  used  to  generate  each  curve. 


FIGURE  4.3-1:  CUMULATIVE  FAILURES  VS.  CUMULATIVE  OPERATING  TIME 
Using Figure 4.3-1 as a visual reference, we can conclude that failure processes
(such as System A) which exhibit a convex curve (one whose slope decreases with
time) on a plot of cumulative failures versus cumulative operating time using linear
scales represent improving systems (i.e., TBF_i are tending to increase). Failure
processes (such as System B) which exhibit a concave curve (one whose slope
increases with time) on such a plot represent deteriorating systems (i.e., TBF_i are
tending to decrease).

This  graphical  technique  provides  a  simple  but  effective  means  to  visually  assess 
whether  or  not  a  trend  exists  in  a  system  failure  process  and  can  be  applied  prior  to 
modeling  using  the  HPP  or  NHPP. 
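
The shape of each curve can also be checked numerically before plotting. The sketch below recovers the interarrival times from the arrival times of Table 4.3-1 and computes the slope of each plot segment (one failure divided by the corresponding TBF). The function names are assumptions made for illustration.

```python
def interarrival_times(arrival_times):
    """Recover TBF values from arrival times (the inverse of Equation (4-1))."""
    previous = [0] + list(arrival_times[:-1])
    return [t - p for t, p in zip(arrival_times, previous)]

def segment_slopes(arrival_times):
    """Slope of each segment of the cumulative-failures plot: one failure per TBF."""
    return [1.0 / tbf for tbf in interarrival_times(arrival_times)]

system_a = [15, 42, 74, 117, 168, 233, 410]
system_b = [177, 242, 293, 336, 368, 395, 410]
print(interarrival_times(system_a))  # [15, 27, 32, 43, 51, 65, 177]
print(interarrival_times(system_b))  # [177, 65, 51, 43, 32, 27, 15]
print(segment_slopes(system_a))      # slopes fall with time: improving system
print(segment_slopes(system_b))      # slopes rise with time: deteriorating system
```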

4.3.2  Laplace  Test  Statistic 

Pierre-Simon Laplace (1749 - 1827) was one of the great mathematicians of the
eighteenth century and was responsible for many of the statistical theorems which
are still in use today, one being the central limit theorem and another being the
much less well-known Laplace statistic. Here we adopt his principle to evaluate
whether or not a trend is present in the failure events of a system.

As with the graphical method discussed in Section 4.3.1, the Laplace test statistic
can also be used to determine if sequential interarrival times (TBF_i) are tending to
increase, decrease or remain the same. The Laplace test statistic for a process with
"n" failures is:

$$U = \frac{\dfrac{\sum_{i=1}^{n-1} \mathrm{TTSF}_i}{n-1} - \dfrac{\mathrm{TTSF}_n}{2}}{\mathrm{TTSF}_n \sqrt{\dfrac{1}{12(n-1)}}} \tag{4-9}$$


The conclusions which can be rendered based on the Laplace statistic, U, are:

a) U approximately equal to zero indicates the lack of trend

b) U greater than zero indicates interarrival values (TBF_i) are tending to
decrease (i.e., system deterioration)

c) U less than zero indicates interarrival values (TBF_i) are tending to increase
(i.e., system improvement)

If we again utilize the system failure processes defined in Table 4.3-1 and
calculate the Laplace statistic for System A and then System B:

System A:

Given: n = 7
TTSF_n = 410

$$\sum_{i=1}^{n-1} \mathrm{TTSF}_i = 15 + 42 + 74 + 117 + 168 + 233 = 649$$

Calculate the Laplace statistic using Equation (4-9):

$$U = \frac{(649/6) - (410/2)}{410 \sqrt{1/72}} = -2.00$$

(System TBF_i tends to increase)

System B:

Given: n = 7
TTSF_n = 410

$$\sum_{i=1}^{n-1} \mathrm{TTSF}_i = 177 + 242 + 293 + 336 + 368 + 395 = 1811$$

Calculate the Laplace statistic using Equation (4-9):

$$U = \frac{(1811/6) - (410/2)}{410 \sqrt{1/72}} = +2.00$$

(System TBF_i tends to decrease)
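
Equation (4-9) can be checked against the Table 4.3-1 data with a short function. This is a minimal sketch; the name `laplace_statistic` is an assumption, and the last arrival time is treated as the end of observation, as in the equation.

```python
import math

def laplace_statistic(arrival_times):
    """Laplace test statistic U of Equation (4-9) for a process with n failures."""
    n = len(arrival_times)
    t_n = arrival_times[-1]
    mean_arrival = sum(arrival_times[:-1]) / (n - 1)
    return (mean_arrival - t_n / 2) / (t_n * math.sqrt(1 / (12 * (n - 1))))

system_a = [15, 42, 74, 117, 168, 233, 410]
system_b = [177, 242, 293, 336, 368, 395, 410]
print(round(laplace_statistic(system_a), 2))  # -2.0
print(round(laplace_statistic(system_b), 2))  # 2.0
```

The two results are equal in magnitude and opposite in sign because System B's interarrival times are System A's in reverse order.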
5.0 MONTE CARLO SIMULATION

The term "simulation" refers to a numerical technique for conducting
experiments on a digital computer which involves evaluating certain types of
mathematical and logical models that describe system behavior. One particular type
of simulation is stochastic simulation, which involves experimenting with a model
over time and includes sampling random variables from probability distributions.
Stochastic simulation is often termed the Monte Carlo method and uses
pseudo-random numbers for the solution of a model. The pseudo-random numbers
are generated by recursive algorithms. Simulation models are useful where
closed-form mathematical solutions are impossible or very time-consuming.

The  Monte  Carlo  method  provides  a  simulation  technique  for  predicting  system 
performance  information  from  part  reliability  characteristics.  This  method  has  been 
applied  in  a  variety  of  ways  to  predict  mechanical  system  reliability.  Simulation 
offers  a  numerical  approach  for  evaluating  system  reliability  where  few  restrictions 
are  placed  on  the  complexity  of  the  model(s). 

The  Monte  Carlo  method  consists  of  repeated  numerical  random  sampling  of  a 
given  model.  The  process  is  essentially  "synthetic  experimentation"  where  many 
systems  are  built  by  computer  calculations  and  the  system  performances  are 
evaluated  and  summarized  to  obtain  system  reliability.  A  flow  chart  illustrating  a 
typical  process  of  Monte  Carlo  simulation  is  shown  in  Figure  5.0-1. 

Two  cases  involving  the  application  of  Monte  Carlo  simulation  for  determining 
mechanical  system  reliability  are  illustrated.  For  Case  One  (1),  the  system  reliability 
block  diagram  and  the  part  reliabilities  are  known.  For  Case  Two  (2),  the  system 
reliability  block  diagram  is  known  along  with  the  stress  -  strength  distributions  for 
each  part. 
FIGURE 5.0-1: FLOW CHART OF MONTE CARLO SIMULATION METHOD [10]


Case  1 


The following procedure can be applied to determine system reliability using the
Monte Carlo method when the system reliability block diagram and part reliabilities
are known. It is assumed that all part reliabilities are evaluated at the same
value of cumulative operating time. Other times can be considered by
iterating Steps 1-5.

Step 1: Given N actual part reliabilities $R_i$ (i = 1, 2, ..., N) and the system
reliability block diagram.

Step 2: Generate N independent uniform variates (random numbers) in the
interval (0, 1) and designate them the required part reliabilities $R'_i$
(i = 1, 2, ..., N).

Step 3: Compare the required part reliabilities to the actual part reliabilities
and consider the ith part a failure when $R'_i > R_i$ (i = 1, 2, 3, ..., N).

Step 4: Determine system success or failure using the system reliability block
diagram and the information from Step 3.

Step 5: Repeat Steps 2 to 4 many times, recording a system failure or success
each time. The system reliability is then estimated as:

$$\hat{R}_{sys} = \frac{\text{Number of system successes}}{\text{Total number of Monte Carlo trials}} \tag{5-1}$$
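
Steps 1 to 5 can be sketched as follows. The block diagram layout (stages in series, parts in parallel within each stage) and the part reliabilities used here are hypothetical values chosen for illustration; they are not taken from Figure 5.0-2.

```python
import random

def simulate_system_reliability(stages, trials, seed=0):
    """Estimate system reliability by the Case 1 procedure.

    `stages` is a list of stages in series; each stage is a list of actual
    part reliabilities for the parts in parallel within that stage.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        system_ok = True
        for stage in stages:
            # Steps 2 and 3: a part fails when its uniform draw exceeds R_i.
            part_ok = [rng.random() <= r for r in stage]
            # Step 4: a parallel stage succeeds if at least one part survives.
            if not any(part_ok):
                system_ok = False
        successes += system_ok
    # Step 5: Equation (5-1).
    return successes / trials

# Assumed reliabilities: Stage A has one part, Stages B and C have two each.
stages = [[0.90], [0.80, 0.70], [0.85, 0.60]]
estimate = simulate_system_reliability(stages, trials=20000)
exact = 0.90 * (1 - 0.20 * 0.30) * (1 - 0.15 * 0.40)
print(round(estimate, 3), round(exact, 3))
```

For a structure this simple the closed-form value is available for comparison; simulation becomes the practical choice only when the block diagram is too complex for direct calculation.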


As an example, the procedure just described in Steps 1-5 is applied to a
mechanical system whose reliability block diagram is shown in Figure 5.0-2. A
simple system was used as an illustration of the procedure. The practical application
of the Monte Carlo method involves more complex systems and the use of
high-speed digital computers to generate many simulated trials.


FIGURE 5.0-2: SYSTEM RELIABILITY BLOCK DIAGRAM


Seven  iterations  of  Steps  2-4  were  performed  to  illustrate  this  method,  and  the 
results  are  presented  in  Table  5.0-1.  In  actual  application,  many  trials  (i.e.,  greater 
than  1,000)  would  be  simulated.  A  table  of  random  values  from  the  uniform 
distribution  was  used  to  generate  part  reliabilities.  The  system  reliability  calculated 
from  the  results  in  Table  5.0-1,  Column  9  and  Equation  (5-1)  yields: 


$$\hat{R}_{sys} = \frac{\text{Number of system successes}}{\text{Total number of Monte Carlo trials}} = \frac{5}{7}$$

$\hat{R}_{sys}$ = 0.71 (from only seven Monte Carlo trials)
TABLE 5.0-1: SEVEN MONTE CARLO TRIALS TO ESTIMATE
THE SYSTEM RELIABILITY OF FIGURE 5.0-2

    (1)           (2)        (3)    (4)      (5)        (6)    (7)      (8)        (9)
Random Value    Result     Random Value    Result     Random Value    Result     System
Generated       For        Generated For   For        Generated For   For        Result*
For R1          Stage A    Block           Stage B    Block           Stage C
                           R2     R3                  R4     R5

    .01         Success    .28    .51      Success    .39    .10      Success    Success
    .38         Success    .37    .63      Success    .90    .49      Failure    Failure
    .08         Success    .48    .35      Success    .25    .23      Success    Success
    .99         Failure    .02    .50      Success    .75    .03      Success    Failure
    .13         Success    .34    .44      Success    .05    .25      Success    Success
    .66         Success    .48    .07      Success    .52    .17      Success    Success
    .31         Success    .80    .60      Success    .56    .13      Success    Success

*System Success = Successful Stage A and Stage B and Stage C

Case 2

The  following  procedure  can  be  applied  to  determine  system  reliability  using 
Monte  Carlo  simulation  when  the  system  reliability  block  diagram  and  the  statistical 
distributions  of  stress-strength  for  each  part  are  known. 

Step 1: Given N parts in the reliability block diagram and the stress ($s_i$) and
strength ($S_i$) distributions for each part (i = 1, 2, 3, ..., N).

Step 2: For $s_i$ and $S_i$ (i = 1, 2, 3, ..., N), determine the proper equation for
calculating the random stresses ($x'_{s_i}$) and random strengths ($x'_{S_i}$) from
Table 5.0-2.

Step 3: Calculate $x'_{s_i}$ and $x'_{S_i}$ (i = 1, 2, 3, ..., N), which requires the
generation of random numbers from either the random standard normal
distribution ($R_N$) or the random standard uniform distribution ($R_U$),
depending on which is required.

Step 4: Compare the random stresses ($x'_{s_i}$) and random strengths ($x'_{S_i}$) and
consider the ith part a failure when:

$x'_{s_i} > x'_{S_i}$ (i = 1, 2, 3, ..., N), i.e., stress greater than strength
TABLE 5.0-2: GENERATION OF RANDOM VALUES FROM VARIOUS
DISTRIBUTIONS GIVEN RANDOM STANDARD NORMAL ($R_N$) AND
RANDOM STANDARD UNIFORM ($R_U$) VALUES*

Distribution to      Probability Density Function                                                          Procedure to Obtain
be Simulated                                                                                               Random Value x'

Exponential          $f(x) = \lambda e^{-\lambda x}$                                                       $x' = -\frac{1}{\lambda} \ln R_U$

Weibull              $f(x) = \frac{\beta}{\alpha^{\beta}} x^{\beta-1} \exp\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right]$    $x' = \alpha \left(-\ln R_U\right)^{1/\beta}$

Normal               $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[\frac{-(x-\mu)^2}{2\sigma^2}\right]$    $x' = \mu + \sigma R_N$

Log-Normal           $f(x) = \frac{1}{\sigma x \sqrt{2\pi}} \exp\left[\frac{-(\ln x-\mu)^2}{2\sigma^2}\right]$    $x' = e^{\mu + \sigma R_N}$

Gamma                $f(x) = \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}$             $x' = -\beta \ln \prod_{i=1}^{\alpha} R_{U_i}$

Note: * $R_N$ is a random value from the normal distribution with μ = 0, σ = 1. $R_U$ is a random
value from the uniform distribution over the interval (0, 1). When more than one value is
required, a typical value is designated as $R_{N_i}$ or $R_{U_i}$. All values are taken
independently of one another.


Step 5: Determine system success or failure using the system reliability block
diagram and the information from Step 4.

Step 6: Repeat Steps 3 to 5 many times, recording a system failure or success
each time. The system reliability is then estimated as:

$$\hat{R}_{sys} = \frac{\text{Number of system successes}}{\text{Total number of Monte Carlo trials}}$$
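
The sampling procedures of Table 5.0-2 and the stress-strength comparison of Steps 3 to 6 can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, the example covers a single part only, and the stress and strength parameters are invented for the demonstration. The gamma sampler is valid only for integer α, since it multiplies α uniform values.

```python
import math
import random

rng = random.Random(1)

def uniform_open():
    """Standard uniform value R_U in (0, 1], so that ln(R_U) is always defined."""
    return 1.0 - rng.random()

def sample_exponential(lam):
    return -math.log(uniform_open()) / lam

def sample_weibull(alpha, beta):
    return alpha * (-math.log(uniform_open())) ** (1.0 / beta)

def sample_normal(mu, sigma):
    return mu + sigma * rng.gauss(0.0, 1.0)

def sample_lognormal(mu, sigma):
    return math.exp(mu + sigma * rng.gauss(0.0, 1.0))

def sample_gamma(alpha, beta):
    """Product-of-uniforms form from Table 5.0-2; alpha must be an integer."""
    return -beta * math.log(math.prod(uniform_open() for _ in range(alpha)))

def part_reliability(sample_stress, sample_strength, trials):
    """Fraction of trials in which the random stress does not exceed the strength."""
    survived = sum(sample_stress() <= sample_strength() for _ in range(trials))
    return survived / trials

# Hypothetical part: normal stress (mean 300, sd 30), normal strength (mean 400, sd 40).
estimate = part_reliability(lambda: sample_normal(300, 30),
                            lambda: sample_normal(400, 40),
                            trials=20000)
# Closed form for this case: Phi((400 - 300) / sqrt(30**2 + 40**2)) = Phi(2).
exact = 0.5 * (1 + math.erf(2 / math.sqrt(2)))
print(round(estimate, 3), round(exact, 3))
```

For a multi-part system, the per-part comparison would be repeated for every block and combined through the reliability block diagram, as in Step 5.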

As the complexity of a system increases and analytical approaches become too
time-consuming in terms of man-hours, simulation procedures structured for
computer application become more advantageous for predicting system reliability.
Recent advances in simulation methodologies and available software have made
simulation one of the most widely used and accepted tools in system analysis.
6.0 FAILURE MODE EVALUATION TECHNIQUES FOR SYSTEMS


Two systematic methods for evaluating the consequence of failure within a
system are presented in this section. They are:

• Failure Mode, Effects and Criticality Analysis (FMECA)

• Fault Tree Analysis (FTA)


Each  of  these  evaluation  tools  has  been  well  documented  in  numerous 
references  such  as:  Fault  Tree  Analysis  Application  Guide  (Reference  [73]),  Fault 
Tree  Handbook  (Reference  [85])  and  MIL-STD-1629A  (Reference  [78]).  The  intent  of 
the  material  contained  in  this  section  is  to  provide  an  overview  of  the  procedures 
required  to  perform  each  analysis  and  relate  their  use  to  mechanical  applications. 


The  approach  utilized  in  the  FMECA  is  different  from  that  taken  by  the  FTA. 
The  FMECA  is  a  "bottom  up"  approach  while  the  FTA  is  a  "top  down"  approach. 
The  reason  for  each  of  these  descriptions  is  illustrated  in  Figure  6.0-1. 

FIGURE 6.0-1: COMPARING THE APPROACH OF FMECA AND FTA
(Panels: FTA Approach and FMECA Approach, each showing "Start with" and "End with")
The approach and procedures for performing each of these analyses will be
discussed in further detail in Sections 6.1 and 6.2.


Table 6.0-1 is provided to assist the analyst in determining which analysis,
FMECA or FTA, is preferred based on the individual circumstances. These
recommendations are made based on the abilities of each approach to best satisfy the
stated circumstance or requirement. For example, the FTA has the ability to handle
failures other than hardware failures; therefore, FTA would be preferred over
FMECA if nonhardware failures require consideration.

TABLE  6.0-1:  FTA  VS.  FMECA  SELECTION  CRITERIA 


Circumstance 

Fault  Tree 
Preferred 

FMECA 

Preferred 

Safety  of  the  general  public  or  operating  and  maintenance  personnel  is 
the  primary  concern 

X 

A small number of clearly differentiated "Top Events" are explicitly
defined

X

Inability to clearly define a small number of "Top Events"

X 

Completion  of  the  Entire  Mission  is  of  critical  importance 

X 

Any  number  of  potentially  successful  missions  are  possible 

X 

All  possible  failure  modes  are  of  concern 

X 

There  is  a  high  potential  for  "Human  Error"  contributions 

X 

There  is  a  high  potential  for  "Software  Error"  contributions 

X 

A  numerical  "Risk  Evaluation"  is  the  primary  concern 

X 

The  system  architecture  is  highly  complex  and/or  it  contains  highly 
interconnected  functional  paths 

X 

The  system  is  basically  of  a  linear  architecture  with  little  human  or 
software  intervention 

X 

The  system  is  not  repairable  once  the  mission  commences 

X 

Require  lowest  cost  analysis 

X 

Require  the  most  timely  analysis 

x 

X 

6.1  Failure  Mode,  Effects  and  Criticality  Analysis  (FMECA) 


The Failure Mode, Effects and Criticality Analysis (FMECA) is a systematic design
evaluation procedure whose purpose is to identify potential failure modes, to
assess their effects throughout the system, and to define failure mode criticality,
which provides a quantitative assessment of each failure mode based on frequency
and consequence. FMECA has become one of the most effective system design
analysis techniques used in reliability engineering. A properly performed FMECA is
used to support a wide range of engineering activities such as design and
maintenance.

The  FMECA  provides: 

1)  The  design  engineer  with  a  method  of  selecting  a  design  with  a  high 
probability  of  operational  success. 

2)  Design  engineering  with  a  uniform  method  for  assessing  failure  modes  and 
their  effects  on  operational  success  of  the  system. 

3)  Early  visibility  of  system  problems. 

4)  A  list  of  possible  failures  which  can  be  ranked  according  to  their  category  of 
effects  and  probability  of  occurrence. 

5)  Identification  of  single  failure  points  critical  to  success. 

6)  Early  criteria  for  test  planning. 

7)  A  basis  for  design  and  location  of  performance  monitoring  and  fault  sensing 
devices  and  other  built-in  automatic  test  equipment. 

8)  A  tool  which  serves  as  an  aid  in  the  evaluation  of  proposed  design, 
operational,  or  procedural  changes  and  their  impact  on  success. 

FMECA  utilizes  inductive  logic  in  a  "bottom  up"  approach.  Beginning  at  the 
lowest  level  of  the  system  hierarchy  (e.g.,  part)  and  from  a  knowledge  of  the  failure 
modes  of  each  part,  the  analyst  traces  up  through  the  system  hierarchy  to  determine 
the  effect  that  each  failure  mode  will  have  on  system  performance.  This  differs 
from  fault  tree  analysis  which  utilizes  deductive  logic  in  a  "top  down"  approach.  In 
fault  tree  analysis,  the  analyst  assumes  a  system  failure  and  traces  down  through  the 
system  hierarchy  to  determine  the  event  or  series  of  events  that  could  cause  such  a 
failure. 

Analysts  performing  an  FMECA  must  first  acquire  a  full  understanding  of  the 
design  and  its  operation.  It  is  important  to  understand  the  function  of  each  part  and 
how  it  interfaces  with  other  parts.  Toward  this  end,  any  information  available  on 
the  design  must  be  obtained.  A  typical  list  of  preferred  documentation  may  include: 
specifications, drawings, stress analyses, test results, reliability predictions, bills of
materials, theory of operations, operating manuals, etc. Once an understanding of
how a system works is acquired, the focus can shift to determining how it can
fail.


The  FMECA  is  a  combination  of  two  analysis  procedures  which  are: 

1)  Failure  Mode  and  Effects  Analysis  (FMEA) 

2)  Criticality  Analysis  (CA) 

The  FMEA  is  an  analysis  procedure  which  identifies  each  potential  failure  mode 
in  a  system.  Each  failure  mode  is  then  assessed  in  terms  of  its  effects  at  the  local, 
next  higher  assembly  and  system  levels.  Finally,  each  failure  mode  is  given  a 
severity  classification  according  to  the  system  level  effects.  The  initial  FMEA  should 
be  done  early  in  the  conceptual  phase  when  design  criteria,  mission  requirements, 
and  preliminary  designs  are  being  developed  to  evaluate  the  design  approach  and  to 
compare  the  benefits  of  competing  design  configurations. 


The  FMEA  will  provide  quick  visibility  of  the  most  obvious  failure  modes  and 
identify  potential  single  failure  points,  some  of  which  can  be  eliminated  with 
minimal  design  modifications.  As  the  design  definitions  become  more  refined,  the 
FMEA  can  be  expanded  to  successively  more  detailed  levels.  When  changes  are 
made  in  system  design  to  remove  or  reduce  the  impact  of  critical  failure  modes,  the 
FMEA  should  be  updated  to  ensure  that  all  predictable  failure  modes  in  the  new 
design  are  considered. 

The analysis approach to be used for the FMEA will generally be dictated by
variations in design complexity and the available data. There are two primary
approaches for accomplishing an FMEA:

•  Functional  FMEA  Approach  -  The  functional  approach  is  normally  used 
when  hardware  items  cannot  be  uniquely  identified.  Each  identified  failure 
mode  is  assigned  a  severity  classification  which  can  be  utilized  during 
design  iterations  to  establish  priorities  for  corrective  actions.  The  functional 
FMEA  should  commence  after  the  design  process  has  delivered  a  functional 
block  diagram  of  the  system  but  has  not  yet  identified  a  specific  hardware 
implementation.  It  is  the  first  FMEA  to  be  performed  and  should  be 
updated  throughout  the  design  iteration  process  or  as  corrective  actions  are 
implemented. 
• Hardware FMEA Approach - The hardware approach is normally used
when  hardware  items  can  be  uniquely  identified  from  schematics,  drawings, 
and  other  engineering  and  design  data.  The  hardware  approach  is  normally 
utilized  in  a  part  level  up  fashion.  Each  identified  failure  mode  is  assigned 
a  severity  classification  which  will  be  utilized  during  design  to  establish 
priorities  for  corrective  actions.  The  hardware  FMEA  should  commence 
after  the  design  process  has  delivered  a  schematic  diagram  with  all  system 
items  or  parts  defined. 


For  complex  systems,  a  combination  of  the  functional  and  hardware  approaches 
may  be  considered.  The  FMEA  may  be  performed  as  a  hardware  analysis,  a 
functional  analysis,  or  a  combination  analysis  and  is  ideally  initiated  at  the  part, 
circuit  or  functional  level  and  proceeds  through  increasing  indenture  levels  until 
the  FMEA  for  the  system  is  complete. 

An optimum set of information has been assembled for performing the FMEA.
This information is typically arranged in the format shown in Figure 6.1-1, which
represents a typical FMEA worksheet.


FAILURE MODE AND EFFECTS ANALYSIS

SYSTEM ____________________          DATE ____________________
INDENTURE LEVEL ___________          SHEET _____ OF _____
REFERENCE DRAWING _________          COMPILED BY _____________
MISSION ___________________          APPROVED BY _____________

Worksheet columns (left to right):

ID Number
Item/Functional Identification (Nomenclature)
Function
Failure Modes and Causes
Mission Phase/Operational Mode
Failure Effects: Local Effects, Next Higher Level, End Effects
Failure Detection Method
Compensating Provisions

FIGURE  6.1-1:  FMEA  WORKSHEET  FORMAT 
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Each  of  the  significant  data  elements  shown  in  Figure  6.1-1  is  defined  as  follows: 

•  Identification  number.  A  serial  number  or  other  reference  designation  is 
assigned  for  traceability  purposes. 

• Item/functional identification. The name or nomenclature of the item or
system function being analyzed for failure mode and effects is listed.

• Function. A concise statement of the function performed by the hardware
item is listed. This includes both the inherent function of the part and its
relationship to interfacing items.

•  Failure  modes  and  causes.  All  predictable  failure  modes  for  each  item 
analyzed  shall  be  identified  and  described.  Potential  failure  modes  are 
determined  by  examination  of  item  outputs  and  functional  outputs 
identified  in  applicable  block  diagrams  and  schematics.  Failure  modes  of 
the  individual  item  function  are  postulated  on  the  basis  of  the  stated 
requirements  in  the  system  definition  narrative  and  the  failure  definitions 
included  in  the  ground  rules.  The  most  probable  causes  associated  with  the 
postulated  failure  mode  are  identified  and  described.  Since  a  failure  mode 
may  have  more  than  one  cause,  all  probable  independent  causes  for  each 
failure  mode  are  identified  and  described. 

• Mission phase/operational mode. A concise statement of the mission phase
and operational mode in which the failure occurs is entered. Where subphase, event,
or  time  can  be  defined  from  the  system  definition  and  mission  profiles,  the 
most  definitive  timing  information  should  also  be  entered  for  the  assumed 
time  of  failure  occurrence. 

•  Local  or  Primary  Effects.  Local  effects  concentrate  specifically  on  the  impact 
an  assumed  failure  mode  has  on  the  operation  and  function  of  the  item  in 
the  indenture  level  under  consideration.  The  consequences  of  each 
postulated  failure  affecting  the  item  shall  be  described  along  with  any 
second-order  effects  which  result.  The  purpose  of  defining  local  effects  is  to 
provide  a  basis  for  evaluating  compensating  provisions  and  for 
recommending  corrective  actions.  It  is  possible  for  the  "local"  effect  to  be 
the  failure  mode  itself. 

•  Next  Higher  Level  or  Secondary  Effects.  Next  higher  level  effects 
concentrate  on  the  impact  an  assumed  failure  has  on  the  operation  and 
function  of  the  items  in  the  next  higher  indenture  level  above  the 
indenture  level  under  consideration.  The  consequences  of  each  postulated 
failure  affecting  the  next  higher  indenture  level  shall  be  described.  If 
analyzing a seal in a pump, the effect that the failed seal has on the pump's
function would be described at this level.

• End or System Effects. End effects evaluate and define the total effect an
assumed  failure  has  on  the  operation,  function,  or  status  of  the  uppermost 
system.  The  end  effect  described  may  be  the  result  of  a  multiple  failure.  For 
example,  failure  of  a  safety  device  may  result  in  a  catastrophic  end  effect 
only  in  the  event  that  both  the  prime  function  goes  beyond  the  limit  for 
which  the  safety  device  is  set  and  the  safety  device  fails. 

•  Failure  Detection  Method.  Describe  the  methods  by  which  occurrence  of  a 
failure  mode  is  detected  by  the  operator  or  maintenance  personnel.  The 
failure  detection  means,  such  as  visual  or  audible  warning  devices, 
automatic  sensing  devices,  sensing  instrumentation,  other  unique 
indications,  or  none,  should  also  be  identified  here. 

•  Failure  Compensation  Method.  Identify  corrective  design  or  other  actions 
required  to  eliminate  the  failure  or  control  the  risk.  This  step  is  required  to 
record  the  true  behavior  of  the  item  in  the  presence  of  a  failure.  The  analyst 
should  describe  design  compensating  provisions  that  will:  (1)  nullify  the 
effects  of  a  failure,  (2)  control  or  deactivate  system  items  to  halt  generation 
or  propagation  of  failure  effects,  or  (3)  activate  backup  or  standby  items  or 
systems. Design compensating provisions can include redundant items that
allow continued and safe operation; safety or relief devices, such as
monitoring or alarm provisions, which permit effective operation or limit
damage; and alternative modes of operation, such as backup or standby items
or systems.

•  Severity  Classification  -  Each  failure  mode  should  be  evaluated  in  terms  of 
the  worst  potential  consequences  which  may  result.  A  code  will  be  assigned 
describing  the  worst  possible  incidence  of  this  failure.  This  code  is  the 
severity  classification  code.  Severity  classifications  are  assigned  to  provide  a 
qualitative  measure  of  the  worst  potential  consequences  resulting  from 
design  error  or  item  failure.  A  severity  classification  is  assigned  to  each 
identified  failure  mode  and  each  item  analyzed.  Severity  classification 
categories  which  are  consistent  with  various  military  standards  are  defined 
as  follows: 

•  Category  I  -  Catastrophic:  A  failure  which  may  cause  death  or  system 
loss. 

•  Category  II  -  Critical:  A  failure  which  may  cause  severe  injury,  major 
property  damage,  or  major  system  damage  which  will  result  in  mission 
loss. 

•  Category  III  -  Marginal:  A  failure  which  may  cause  minor  injury,  minor 
property  damage,  or  minor  system  damage  which  will  result  in  delay  or 
loss of availability or mission degradation.

• Category IV - Minor: A failure not serious enough to cause injury,
property  damage,  or  system  damage,  but  which  will  result  in 
unscheduled  maintenance  or  repair. 

Where  it  may  not  be  possible  to  identify  an  item  or  a  failure  mode  according 
to  the  loss  statements  in  the  four  categories  above,  similar  loss  statements 
based  upon  loss  of  system  inputs  or  outputs  can  be  developed. 

Many  experienced  analysts  may  choose  to  customize  the  severity  classifications 
and  derive  a  set  of  classifications  which  are  more  explanatory  for  a  particular  system. 
For  example,  the  following  severity  classifications  were  used  for  a  missile  fuze 
FMEA: 

1)  No  effect  on  mission 

2)  Fuze  does  not  detonate  after  launch 

3)  Fuze  detonates  too  high  after  electrical  arming 

4)  Fuze  detonates  too  low  after  electrical  arming 

5)  Fuze  detonates  after  launch  but  before  electrical  arming 

6)  Fuze  detonates  before  launch 

Next  we  will  consider  the  criticality  analysis  (CA).  The  CA  is  an  analysis 
procedure  for  associating  criticality  numerics  with  each  failure  mode.  The  CA 
complements  the  FMEA  and  is  dependent  upon  information  developed  in  that 
analysis.  The  CA  is  typically  performed  in  conjunction  with  or  following  the 
FMEA.  The  CA  is  a  valuable  tool  for  maintenance  and  logistic  support  since  failure 
modes  which  have  a  high  probability  of  occurrence  (high  criticality  numbers)  and  a 
significant  consequence  can  be  identified  and  assessed  in  terms  of  potential  impact 
on  the  requirements  for  the  system. 

The  analysis  approach  to  be  used  for  the  CA  will  generally  be  dictated  by  the 
availability  of  failure  rate  data.  There  are  two  approaches  for  accomplishing  the  CA. 
One  is  the  qualitative  approach  which  is  appropriate  only  when  failure  rate  data  is 
not  available.  The  preferred  method  is  the  quantitative  approach  which  is  utilized 
when failure rate data is available.

A criticality analysis worksheet is shown in Figure 6.1-2. The criticality analysis
complements  the  FMEA  with  additional  data  elements  which  are  indicated  by 
asterisks  in  Figure  6.1-2. 

The following data elements are necessary to perform either a
qualitative or quantitative criticality analysis:

• Failure Probability/Failure Rate Data Source. When a qualitative CA is
performed,  failure  modes  are  assessed  in  terms  of  probability  of  occurrence, 
and  the  failure  probability  of  occurrence  level  must  be  shown  in  this 
column.  When  failure  rate  data  are  available,  a  quantitative  CA  can  be 
performed  and  criticality  numbers  may  be  calculated.  In  this  case,  the  data 
source  of  the  failure  rates  used  in  each  calculation  shall  be  listed  in  this 
column.  When  a  failure  probability  is  listed,  the  remaining  columns  are 
not  required. 

• Failure Effect Probability (β). The β value is the conditional probability that
the failure effect will result in the identified criticality classification, given
that the failure mode occurs. The β value represents the analyst's judgment
as to the conditional probability the loss will occur and should be quantified
in general accordance with the values in Table 6.1-1.

TABLE 6.1-1: TYPICAL FAILURE EFFECT PROBABILITIES (β)

FAILURE EFFECT      β VALUE
Actual Loss         1.00
Probable Loss       > 0.10 to < 1.00
Possible Loss       > 0 to 0.10
No Effect           0

• Failure mode ratio (α). The fraction of the part failure rate (λp) related to
the particular failure mode under consideration shall be evaluated by the
analyst and recorded here. The failure mode ratio is the probability,
expressed as a decimal fraction, that the part or item will fail in the identified
mode. RAC publication "Failure Mode/Mechanism Distributions" (FMD-91)
provides generic failure mode ratio (α) data such as shown in Table 6.1-2.

• Part failure rate (λp). The part failure rate (λp) from the appropriate
reliability prediction or failure rate data source such as RAC publication
"Nonelectronic Parts Reliability Data" (NPRD-91) shall be listed.

FIGURE 6.1-2: CRITICALITY ANALYSIS WORKSHEET


TABLE 6.1-2: EXAMPLE FAILURE MODE RATIO (α) DATA

Device Type                     Failure Mode                     Failure Mode Probability (α)

Accumulator                     Leaking                          .47
                                Seized                           .23
                                Worn                             .20
                                Contaminated                     .10

Actuator                        Spurious Position Change         .36
                                Binding                          .27
                                Leaking                          .22
                                Seized                           .15

Adapter                         Physical Damage                  .33
                                Out of Adjustment                .33
                                Leaking                          .33

Alarm                           False Indication                 .48
                                Failure to Operate on Demand     .29
                                Spurious Operation               .18
                                Degraded Alarm                   .05

Antenna                         No Transmission                  .54
                                Signal Leakage                   .21
                                Spurious Transmission            .25

Battery, Lithium                Degraded Output                  .78
                                Startup Delay                    .14
                                Short                            .06
                                Open                             .02

Battery, Lead Acid              Degraded Output                  [illegible]
                                Short                            [illegible]
                                Intermittent Output              [illegible]

Battery, Rechargeable, Ni-Cd    Degraded Output                  .72
                                No Output                        .28

Bearing                         Binding/Sticking                 .50
                                Excessive Play                   .43
                                Contaminated                     .07

Belt                            Excessive Wear                   .75
                                Broken                           .25

• Operating time (t). The operating time in hours or the number of operating
cycles  of  the  item  per  mission  is  derived  from  the  system  definition  and 
listed  on  the  worksheet. 


• Failure mode criticality number (Cm). The value of the failure mode
criticality number (Cm) is calculated and listed on the worksheet. Cm is
the portion of an item's criticality number due to the single failure mode
under investigation for a particular severity classification. For each
particular failure mode severity classification, Cm is calculated with
the following formula:

$$C_m = \beta \, \alpha \, \lambda_p \, t \qquad (6\text{-}1)$$

where,

Cm = Criticality number for failure mode
β = Conditional probability of mission loss
α = Failure mode ratio
λp = Part failure rate
t = Duration of applicable mission phase, usually expressed in
hours or number of operating cycles

• Item criticality numbers (Cr) - The second criticality number calculation is
for the item under analysis. The item criticality number (Cr) for each system
item under investigation is calculated and listed on the worksheet. An item
may be considered a component, assembly or function depending on the
detail of analysis or level of indenture at which the FMECA is being
performed. For a particular severity classification and mission phase, the
Cr for an item is the sum of the failure mode criticality numbers (Cm)
under the severity classification and may also be calculated using the
following formula:

$$C_r = \sum_{n=1}^{j} (\beta \, \alpha \, \lambda_p \, t)_n \quad \text{or} \quad C_r = \sum_{n=1}^{j} (C_m)_n, \qquad n = 1, 2, 3, \ldots, j \qquad (6\text{-}2)$$

where,

Cr = Criticality number for the item.

n = The failure modes in the item that fall under a particular severity
classification.

j = Total number of failure modes in the item under the severity
classification.
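
As an illustration of Equations (6-1) and (6-2), the following Python sketch computes the failure mode criticality numbers and sums them by severity classification. The failure mode ratios are the accumulator values from Table 6.1-2. The part failure rate, mission time, β values and severity assignments are hypothetical numbers chosen only for illustration, and the function and field names are not taken from any standard.

```python
from collections import defaultdict


def mode_criticality(beta, alpha, lambda_p, t):
    """Failure mode criticality number, Equation (6-1): Cm = beta * alpha * lambda_p * t."""
    return beta * alpha * lambda_p * t


def item_criticality(modes, lambda_p, t):
    """Item criticality numbers, Equation (6-2): sum of Cm within each severity class."""
    totals = defaultdict(float)
    for mode in modes:
        totals[mode["severity"]] += mode_criticality(
            mode["beta"], mode["alpha"], lambda_p, t
        )
    return dict(totals)


# alpha values: accumulator rows of Table 6.1-2.
# All other numbers below are assumed for illustration only.
LAMBDA_P = 20e-6  # assumed part failure rate, failures per hour
T = 100.0         # assumed mission duration, hours
ACCUMULATOR_MODES = [
    {"name": "Leaking", "alpha": 0.47, "beta": 0.10, "severity": "III"},
    {"name": "Seized", "alpha": 0.23, "beta": 1.00, "severity": "II"},
    {"name": "Worn", "alpha": 0.20, "beta": 0.10, "severity": "III"},
    {"name": "Contaminated", "alpha": 0.10, "beta": 0.50, "severity": "II"},
]

if __name__ == "__main__":
    for mode in ACCUMULATOR_MODES:
        cm = mode_criticality(mode["beta"], mode["alpha"], LAMBDA_P, T)
        print(f"{mode['name']:<14} Cm = {cm:.2e}")
    for severity, cr in sorted(item_criticality(ACCUMULATOR_MODES, LAMBDA_P, T).items()):
        print(f"Category {severity:<4} Cr = {cr:.2e}")
```

Grouping by severity classification before summing mirrors the definition of Cr, which is computed separately for each severity classification rather than once per item.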


Other  customized  variations  of  the  FMECA  worksheet  have  been  developed  to 
serve  specific  individual  requirements.  Such  an  example  is  shown  in  Figure  6.1-3. 
This FMECA worksheet contains eleven fields (columns) of data on each part
considered. These fields are described as follows:

•  Name 

A  label  which  designates  a  particular  part. 

•  I.D.  Number 

An  alphanumeric  code  used  to  designate  the  part. 

•  Function 

A  description  of  the  part's  major  role. 

• Part Failure Rate (λp)

A  numerical  value  having  units  of  failures  per  unit  time.  Normally, 
the  failure  rate  is  obtained  from  a  part  reliability  prediction. 

•  Failure  Mode  Description 

Identifies  the  mode  in  which  the  part  could  fail  (refer  to  Table  3.1-1). 

•  Failure  Mode  Probability 

A  percentage  which  indicates  the  probability  of  occurrence  of  each 
individual  failure  mode.  Same  as  the  failure  mode  ratio. 

•  Local  Effect 

A  description  of  the  immediate  effect  of  failure  by  the  respective  failure 
mode. 

•  End  Effect 

A  description  of  the  end  effect  (created  by  the  respective  failure  mode) 
resulting  from  the  propagation  of  failures  through  the  system. 

•  Failure  Classification 

A  code  that  identifies  a  level  of  criticality  for  each  individual  failure 
mode.  Normally,  the  failure  classification  is  relative  to  the  system 
performance.  Equivalent  to  the  severity  classification. 

•  Modal  Failure  Rate 

A  numerical  value  that  is  the  product  of  the  part  failure  rate  and  the 
failure  mode  probability.  The  modal  failure  rate  along  with  the  failure 
classification  are  used  to  assess  the  criticality  of  each  individual  failure 
mode. 

•  Comments 

Any  other  relevant  items  are  listed  in  this  column.  Items  such  as 
references,  duty  cycle,  Weibull  shape  parameter  and  lubricant  are 
examples.

The information placed in the eleven columns of Figure 6.1-3 may include, for
example:

Part Characteristics

    Name: Oil Seal (Radial Lip)
    I.D. Number: 643293
    Function: Prevents the flow of lubricant at the interface between the
        cutting cylinder shaft and frame.
    Failure Rate: .80 (failures/10^6 part hours)

Failure Mode

    Description: Leakage
    Probability of Occurrence: .75

Effects of Failure

    Local Effect: Lubricant is lost from tapered roller bearing cavity.
    End Effect: Abrasives collect at lip seal and shaft interface and accelerate
        wear of both.

Failure Classification: Degraded System Operation (DSO)

Modal Failure Rate: .60 (failures/10^6 part hours)

Comments: Failure rate based on a shaft speed of 1725 rpm.
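
The modal failure rate in the oil seal example is simply the part failure rate multiplied by the failure mode probability. A minimal Python sketch of one worksheet row follows. The class and field names are illustrative assumptions, not part of the worksheet definition; the numbers are those of the oil seal example.

```python
from dataclasses import dataclass


@dataclass
class WorksheetRow:
    """One row of the customized FMECA worksheet (field names are illustrative)."""

    name: str
    id_number: str
    failure_mode: str
    part_failure_rate: float         # failures per 10^6 part hours
    failure_mode_probability: float  # decimal fraction, same as failure mode ratio

    @property
    def modal_failure_rate(self) -> float:
        """Part failure rate multiplied by the failure mode probability."""
        return self.part_failure_rate * self.failure_mode_probability


oil_seal = WorksheetRow("Oil Seal (Radial Lip)", "643293", "Leakage", 0.80, 0.75)
print(f"Modal failure rate: {oil_seal.modal_failure_rate:.2f} failures per 10^6 part hours")
```

Rounded to two decimals, this reproduces the .60 failures per 10^6 part hours listed in the example.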

The  "automated  FMECA"  offers  significant  improvement  over  the  manual 
FMECA.  An  automated  FMECA  can  provide  the  necessary  traceability  from  design 
elements  to  the  failure  effects  as  well  as  from  failure  effects  to  the  design  elements. 
This  dual  line  of  traceability  was  not  readily  obtainable  in  the  original  "manual 
tabular"  method.  The  automated  technique  should  utilize  a  clear  and 
straightforward  approach  to  cross-referencing  between  design  elements  (e.g.,  parts) 
and  failure  effects.  This  allows  the  practical  use  of  the  FMECA  by  those  involved  in 
various  disciplines  (e.g.,  reliability,  maintenance,  design,  management,  etc.)  relating 
to product development.

When performing an automated FMECA, the database and report generator
should  be  structured  so  that  (at  least)  two  major  listings  are  generated  as  output, 
namely  the  "design  element"  listing  and  the  "effects"  listing.  An  example  of  each 
listing  is  shown  in  Figure  6.1-4  and  Figure  6.1-5.  Lind  (Reference  [71])  described  each 
of  these  listings  as  follows: 


The  Design  Element  Listing 

"The  design  element  listing  identifies  the  design  elements  and 
their  failure  modes  in  the  order  that  the  analysis  was  coded. 
Comments  can  also  be  included  in  this  listing.  To  allow  space  for 
additional  FMEA  data,  the  effect  descriptions  are  not  printed  but 
instead  are  referenced  by  effect  number.  Listing  by  design  elements 
allows  verification  that  all  failure  modes  of  all  design  elements  were 
considered  in  the  analysis.  Failure  mode  causes  preventive  measures 
can  be  included  in  this  listing." 

The  Effect  Listing 

"The  effect  listing  identifies  the  effects  with  the  design  element  and 
applicable  failure  modes  that  could  cause  those  effects.  The  effect 
listing  provides  a  convenient  method  of  identifying  all  pertinent 
information  such  as  design  elements,  failure  modes,  criticality,  and 
failure  rates  that  should  be  considered  in  the  evaluation  of  the  effects." 

It is interesting to note that the information contained in the effect listing
shown in Figure 6.1-5 would provide an excellent troubleshooting document for a
system. All that is required is to adopt a standard set of end effects at the system
level. Many FMECAs do not take advantage of this highly useful feature.
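
The two listings can be generated from a single set of records, which is what gives the automated FMECA its dual traceability. The Python sketch below is one possible arrangement, not Lind's implementation: the record layout, function names and the choice to order contributors by descending modal failure rate are assumptions. The data values are transcribed from Figures 6.1-4 and 6.1-5.

```python
from collections import defaultdict

# One record per design element failure mode (values from Figure 6.1-4).
RECORDS = [
    {"line": 1, "element": "O-Ring", "id": "O-1", "mode": "Leakage",
     "effect": "E001", "rate": 0.300},
    {"line": 4, "element": "Spur Gear", "id": "G-1", "mode": "Broken Tooth",
     "effect": "E001", "rate": 0.002},
    {"line": 8, "element": "Ball Bearing", "id": "B-2", "mode": "Misalignment",
     "effect": "E001", "rate": 0.010},
]

# Standard set of end effects at the system level (values from Figure 6.1-5).
EFFECTS = {"E001": {"description": "Loss of Actuation", "criticality": "CSF"}}


def design_element_listing(records):
    """Design elements in the order coded; effects are referenced by number only."""
    return sorted(records, key=lambda r: r["line"])


def effect_listing(records, effects):
    """Group failure modes by end effect and total their modal failure rates."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["effect"]].append(record)
    listing = {}
    for number, rows in grouped.items():
        listing[number] = {
            **effects[number],
            "total_rate": sum(r["rate"] for r in rows),
            # Assumed ordering: largest contributor first.
            "contributors": sorted(rows, key=lambda r: r["rate"], reverse=True),
        }
    return listing


if __name__ == "__main__":
    for number, entry in effect_listing(RECORDS, EFFECTS).items():
        print(f"{number} {entry['description']} ({entry['criticality']}), "
              f"total modal failure rate {entry['total_rate']:.3f}")
        for row in entry["contributors"]:
            print(f"    line {row['line']}: {row['element']} - {row['mode']} ({row['rate']:.3f})")
```

Read from the effect side, the grouped output is the troubleshooting view described above: for an observed end effect it lists the candidate design elements and failure modes, most likely first.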

6.2  Fault  Tree  Analysis  (FTA) 

Fault  Tree  Analysis  (FTA)  is  a  failure  mode  evaluation  technique  that  can  be 
applied  to  any  system.  This  technique  (1)  identifies  undesired  event(s)  (system 
failure  event),  (2)  graphically  represents  all  levels  of  subevents  and  causes  which  are 
related to creating the undesired event, and (3) qualitatively and/or quantitatively
assesses the occurrence (or nonoccurrence) of the undesired event.

Design Element Listing

Line      Design          Design Element    Failure         End Effect    Modal Failure
Number    Element         I.D. Number       Mode            Number        Rate

1         O-Ring          O-1               Leakage         E001          .300
4         Spur Gear       G-1               Broken Tooth    E001          .002
8         Ball Bearing    B-2               Misalignment    E001          .010

FIGURE 6.1-4: THE MAJOR HEADINGS IN THE DESIGN ELEMENT LISTING


End       End Effect           Criticality    Total Modal     Design          Design Element    Failure         Modal      Line
Effect    Description          Level          Failure Rate    Element         I.D. Number       Mode            Failure    Number
Number                                                                                                          Rate

E001      Loss of Actuation    CSF            .312            O-Ring          O-1               Leakage         .300       1
                                                              Ball Bearing    B-1               Misalignment    .010       8
                                                              Spur Gear       G-1               Broken Tooth    .002       4

FIGURE 6.1-5: THE MAJOR HEADINGS IN THE EFFECTS LISTING

FTA  provides  an  alternate  failure  mode  evaluation  technique  to  the  FMECA 
discussed  in  Section  6.1.  Recall  that  the  FMECA  starts  at  the  lowest  level  of  system 
configuration  whereas  the  FTA  begins  at  the  top  level.  This  is  the  primary 
difference  in  their  approach  to  evaluating  the  failure  modes  of  a  system. 

Fault  tree  construction  begins  with  the  identification  of  the  undesired  system 
failure  event  that  forms  the  top  event  of  the  fault  tree.  The  top  event  is  then  linked 
to  its  immediate  causes  and/or  sub-events  using  the  appropriate  logic  symbols 
shown  in  Figure  6.2-1.  Linking  of  sub-events  and  causes  continues  in  turn  until  the 
desired basic cause level is reached.

OR Gate

Shows a logical OR relationship. The output event will
happen  if  and  only  if  one  or  more  of  the  input  events 
happen. 

AND  Gate 

Shows  a  logical  AND  relationship.  The  output  event 
will  happen  if  and  only  if  all  of  the  input  events 
happen. 


Intermediate  Event 

Describes  a  condition  or  event  that  is  caused  by  the 
combination  of  fault  causes  through  the  input  gate. 

Basic  Event 

Shows  the  failure  of  an  elementary  component.  No 
further  development  is  needed. 


Undeveloped  Event 

Shows  an  input  event  which  could  be  developed 
further  if  more  information  were  available.  Could  be 
used  to  show  the  failure  of  a  subsystem  which  is 
considered  to  be  a  basic  input  event. 


Transfer  Symbol 

Shows  where  one  branch  of  the  fault  tree  connects  to 
another  branch,  and  often  used  from  one  sheet  to 
another.  Connection  to  the  apex  means  input  to  the 
tapped  event;  connection  to  the  side  means  output 
from  the  tapped  event.  Upside  down  means  the 
transferred  branch  is  similar  to  the  tapped  event.  The 
number  in  the  triangle  identifies  the  connections. 


INHIBIT  Gate 

Shows  that  the  output  event  will  only  occur  if  the 
input  event  happens  in  such  a  way  that  some 
restriction  is  satisfied. 

Combination  (COM)  Gate 

Shows  that  the  output  event  will  happen  if  any  r  out  of 
the  n  input  events  happen. 


FIGURE 6.2-1: SYMBOLS USED IN FTA

As an example, a fault tree was developed for the torque wrench design
shown in Figure 6.2-2. The undesired event was specified as "the wrench applies
the wrong torque." The fault tree resulting from this undesired event is shown in
Figure 6.2-3.


[Figure: torque wrench assembly. Labeled parts: adjusting screw, handle, ratchet,
drive pin, ratchet spring, thrust washer, retaining ring, torque spring, end cap,
reversing collar, reversing spring.]

FIGURE 6.2-2: TORQUE WRENCH DESIGN [67]

Evaluation  of  the  fault  tree  may  be  qualitative  or  quantitative.  Qualitative 
analysis  determines  the  combinations  of  basic  failures  that  lead  to  the  undesired 
event  and  the  combinations  of  the  complementary  events  to  assure  that  the  top 
event  will  not  occur.  A  qualitative  analysis  is  usually  carried  out  when  no 
elementary-level  reliability  data  is  available  or  where  the  primary  objectives  of  the 
FTA  can  be  fulfilled  without  quantifying  the  results. 

In  a  quantitative  analysis  the  probability  of  occurrence  of  the  top  event  is 
calculated  using  the  quantitative  information  of  the  basic  events.  Various 
quantitative  analysis  methods  have  been  discussed  in  the  literature  (References  [54], 
[59] and [65]) for evaluating the fault tree.

One possible method for quantitatively assessing the probability of occurrence of
the top event is to propagate the probabilities of occurrence of the basic events
upward through the fault tree. This method requires that the probability of
occurrence of every basic event be known.

The  combining  of  these  probabilities  follows  the  basic  laws  of  probability  given  in 
Appendix  B.  The  AND  gate  (refer  to  Figure  6.2-4  and  Rule  3  of  Appendix  B)  dictates 
that  the  following  probability  rule  be  applied  to  obtain  the  AND  gate  output 
probability  (assuming  all  fault  events  are  independent): 

$$\Pr(A_1 \text{ and } A_2 \text{ and } \ldots A_n) = \prod_{i=1}^{n} \Pr(A_i) = \Pr(A_1)\,\Pr(A_2) \cdots \Pr(A_n) \qquad (6\text{-}3)$$

[Figure: AND gate with inputs Pr(A1), Pr(A2), Pr(A3), ..., Pr(An)]

FIGURE 6.2-4: AND GATE INPUT, OUTPUT

The OR gate (refer to Figure 6.2-5 and Rule 5 or 7 of Appendix B) requires that
one of the following probability rules be applied to obtain the OR gate output
probability:

Inputs mutually exclusive:

$$\Pr(A_1 \text{ or } A_2 \text{ or } \ldots A_n) = \sum_{i=1}^{n} \Pr(A_i) \qquad (6\text{-}4A)$$

Inputs not mutually exclusive (assuming all fault events are independent):

$$\Pr(A_1 \text{ and/or } A_2 \text{ and/or } \ldots \text{ and/or } A_n) = 1 - \prod_{i=1}^{n} \left[ 1 - \Pr(A_i) \right] \qquad (6\text{-}4B)$$

where,

Pr(Ai) = probability of occurrence of the ith input event (an input event may
be represented as an event block, basic component fault or
undeveloped event, which are shown in Figure 6.2-1)

n = number of input events which may or may not be mutually
exclusive
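
The three gate rules translate directly into code. The following Python sketch implements Equations (6-3), (6-4A) and (6-4B); the function names and the input probabilities in the usage lines are illustrative assumptions, and the AND and non-exclusive OR rules assume independent input events.

```python
from math import prod


def and_gate(probabilities):
    """Equation (6-3): output probability of an AND gate with independent inputs."""
    return prod(probabilities)


def or_gate_exclusive(probabilities):
    """Equation (6-4A): output probability of an OR gate with mutually exclusive inputs."""
    return sum(probabilities)


def or_gate_independent(probabilities):
    """Equation (6-4B): output probability of an OR gate with independent,
    not mutually exclusive inputs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


if __name__ == "__main__":
    inputs = [0.1, 0.2]  # hypothetical input event probabilities
    print(f"AND gate:               {and_gate(inputs):.4f}")
    print(f"OR gate (exclusive):    {or_gate_exclusive(inputs):.4f}")
    print(f"OR gate (independent):  {or_gate_independent(inputs):.4f}")
```

For small input probabilities the two OR rules give nearly the same result, because the product terms neglected by Equation (6-4A) are small.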


[Figure: OR gate with output Pr(A1 or A2 or ... An), shown for mutually exclusive
input events]

FIGURE 6.2-5: OR GATE INPUT, OUTPUT

Consider as an example the coolant supply system shown in Figure 6.2-6 from
which  the  fault  tree  shown  in  Figure  6.2-7  is  developed.  The  undesired  event  is 
identified  as  a  loss  of  minimum  flow  to  the  heat  exchanger.  When  the  propagation 
of  the  probabilities  of  occurrence  is  applied  to  the  fault  tree  of  Figure  6.2-7,  orderly 
calculations  are  performed  to  determine  the  probability  of  occurrence  of  the 
undesired  event.  The  calculation  procedure  is  initiated  at  the  elementary  level  of 
the  fault  tree.  The  output  of  each  gate  is  then  determined  from  the  gate  type  and  the 
gate  inputs.  Equations  (6-3)  and  (6-4)  are  applied  in  the  following  sequence  to 
determine  the  probability  of  occurrence  of  the  undesired  event.  The  notation 
follows  that  identified  in  Figure  6.2-7. 

1) $\Pr(X_8) = 1 - [1 - \Pr(X_{14})]\,[1 - \Pr(X_{15})]\,[1 - \Pr(X_{16})]$

2) $\Pr(X_5) = 1 - [1 - \Pr(X_8)]\,[1 - \Pr(X_9)]$

3) $\Pr(X_7) = \Pr(X_{12})\,\Pr(X_{13})$

4) $\Pr(X_4) = 1 - [1 - \Pr(X_7)]\,[1 - \Pr(X_{?})]$

5) $\Pr(X_2) = 1 - [1 - \Pr(X_3)]\,[1 - \Pr(X_4)]\,[1 - \Pr(X_5)]$

6) $\Pr(A) = 1 - [1 - \Pr(X_2)]\,[1 - \Pr(X_1)]$
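
The six steps above can be sketched in Python as a bottom-up propagation. This is only an illustration under stated assumptions: the basic event probabilities are hypothetical, the second input of step 4 is not legible in the source and is given the placeholder key "X_step4_input", and the second input of step 6 is read as X1.

```python
from math import prod


def and_gate(probabilities):
    """Equation (6-3), independent inputs."""
    return prod(probabilities)


def or_gate(probabilities):
    """Equation (6-4B), independent inputs that are not mutually exclusive."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def top_event_probability(p):
    """Propagate basic event probabilities upward, following steps 1 to 6."""
    x8 = or_gate([p["X14"], p["X15"], p["X16"]])  # step 1
    x5 = or_gate([x8, p["X9"]])                   # step 2
    x7 = and_gate([p["X12"], p["X13"]])           # step 3
    x4 = or_gate([x7, p["X_step4_input"]])        # step 4 (placeholder input name)
    x2 = or_gate([p["X3"], x4, x5])               # step 5
    return or_gate([x2, p["X1"]])                 # step 6 (X1 is an assumed reading)


# Hypothetical probability of occurrence for every basic event.
BASIC_EVENTS = {
    name: 0.01
    for name in ["X1", "X3", "X9", "X12", "X13", "X14", "X15", "X16", "X_step4_input"]
}

if __name__ == "__main__":
    print(f"Pr(A) = {top_event_probability(BASIC_EVENTS):.6f}")
```

Because every gate except the one in step 3 is an OR gate, the top event probability is dominated by the single-input basic events; the AND-gated pair X12 and X13 contributes only the product of their probabilities.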


FIGURE 6.2-6: COOLANT SUPPLY SYSTEM [58]

FIGURE 6.2-7: FAULT TREE FOR COOLANT SUPPLY SYSTEM [58]

7.0 CONCLUSION

It  is  important  that  engineering  and  management  personnel  develop  an 
awareness  of  the  quantitative  mechanical  reliability  evaluation  tools  which  are 
available today. Reliability evaluation techniques represent useful design tools and
management aids for improving overall system performance. Mechanical
Applications In Reliability Engineering represents a serious effort to collect and
present only those practical mechanical part and system reliability evaluation
techniques which have proven useful in application. Many other valuable reliability
techniques currently exist which have not been discussed here. We strongly
encourage analysts to keep a constant vigil to discover and effectively utilize all of
these new tools.

Reliability  engineering  has  grown  from  a  need  by  engineering  and  management 
personnel  to  quantitatively  examine  the  "quality  over  the  long  run"  of  parts  and 
systems. As more historical mechanical reliability data is collected and
analyzed, mechanical reliability evaluation techniques can become more
effective in improving engineering designs. To make the current models more
accurate, it is extremely important that mechanical reliability data (e.g.,
times-to-failure, failure mode probability of occurrence, probabilistic material
characteristics and system failure process characteristics) be collected for
statistical analysis. The Reliability Analysis Center maintains an extensive
mechanical reliability database/library where mechanical reliability data is
collected, analyzed and summarized. The RAC encourages all organizations to
utilize and/or contribute to this growing source of information.

New engineering approaches to improve part/system performance are
continually being developed in such areas as:

•  finite  element  analysis 

•  cumulative  damage  analysis 

•  fatigue  analysis 

•  optimization  methods 

•  probabilistic  design 

• system reliability

In order for us to effectively apply the theories and tools generated from these
state-of-the-art research areas to evaluate and improve the reliability of parts and
systems, we must understand basic reliability practices and continue to be informed.
This is accomplished by reading reliability periodicals, attending R&M
conferences/symposiums, and ultimately evaluating the thoughts of the "many"
until personal understanding is achieved. In this endeavor we wish you the best of
luck! 

α (Alpha): The characteristic life of the Weibull distribution. 63.2% of the
lifetimes will be less than the characteristic life regardless of the value of
β, the Weibull slope parameter.

Assessment: The use of test data and/or operational service data to form
estimates of population parameters and to evaluate the precision of those
estimates (synonym - Estimation).

β (Beta): The parameter of the Weibull distribution that determines its shape
and that implies the failure mode characteristic (infant mortality, random, or
wearout). It is also called the slope parameter because it is estimated by the
slope of the straight line on Weibull probability paper.

Confidence Coefficient: A measure of assurance that a statement based upon
statistical (frequency) data is correct. The probability that an unknown
parameter lies within a stated interval or is greater than or less than some
stated value.

Confidence Interval: A region within which an unknown parameter is said to lie
with stated probability. The region is two-sided when both upper and lower
limits are specified. It is one-sided when only the upper or the lower limit is
specified.

Confidence Limit: A bound of a confidence interval.

Constant Failure Rate: Characterizes a part with constant Hazard Rate, h(t):
h(t) = constant = λ.

Criticality: A relative measure of the consequences of a failure mode and its
frequency of occurrence.

Criticality Analysis (CA): A procedure by which each potential failure mode is
ranked according to the combined influences of severity and probability of
occurrence.

Cumulative Distribution Function, F(x): The probability F(x) that a random
variable X takes a value less than or equal to x.

Decreasing Force of Mortality: Characterizes a part with decreasing hazard
rate. This may occur, for instance, during the early portion of part life as
indicated by the "bath tub" curve for parts.

Evaluation: A broad term used to encompass prediction, measurement, and
demonstration.

Exponential Distribution: A probability distribution having the density
function $f(x) = \lambda e^{-\lambda x}$, where λ is constant.

Failure: Performance below a specified minimum level or outside a specified
tolerance interval.

Failure Effect: The consequence(s) a failure mode has on the operation,
function or status of an item. Failure effects are classified as primary or
local effects, secondary or next-level effects, and system or end effects.

Failure Mechanism: A description of the failure process.

Failure Mode: The manner by which a failure is observed. Generally describes
the way the failure affects the function of a part.

Failure Modes and Effects Analysis (FMEA): A procedure by which each potential
failure mode in a system is analyzed to determine the results or effects on the
system and to classify each potential failure mode according to its severity.

Hazard Rate, h(t): Also called the force of mortality (FOM). The product
h(x)Δx represents the probability that an item still functioning at time x will
fail in the interval (x, x + Δx), where Δx is an infinitesimal time increment.
The hazard rate is not a density function. The hazard rate is defined by:

$$h(x) = \frac{f(x)}{1 - F(x)}$$

Increasing Force of Mortality: Characterizes a part with increasing hazard
rate. This may occur, for instance, during the later portion of part life as
indicated by the "bath tub" curve for parts.

Infant Mortality: A failure mode characterized by a hazard rate that decreases
with age, i.e., new units are more likely to fail than old units.

Indenture Levels: The item levels which identify or describe relative
complexity of assembly or function. The levels progress from more complex
(major system) to the simpler (part) divisions.

Item: A nonspecific term used to denote any product, including systems,
subsystems, sets, groups, assemblies, subassemblies, parts, materials,
accessories, and so forth.

Log Normal Distribution: Statistical distribution which characterizes times to
failure of items displaying normally distributed logarithms of times to
failure.

Mean, μ: The first moment of a probability distribution about its origin, the
expected value of a random variable. The mean is the most commonly used measure
of central tendency. Estimated by an arithmetic average.

Mean-Time-Between-Failure (MTBF): A basic measure of reliability for repairable
systems which follow the Homogeneous Poisson Process.

Mean-Time-To-Failure (MTTF): The mean time to failure. If $f_P(x)$ is the
Probability Density Function of random variable P, then

$$\text{MTTF} = \int_0^{\infty} x\, f_P(x)\, dx$$

Monte Carlo Simulation: A mathematical model of a system with random elements,
usually computer-adapted, whose outcome depends on the application of randomly
generated numbers.

Normal Distribution: The most prominent continuous distribution in statistics,
frequently referred to as the Gaussian or bell-shaped distribution. Its density
function is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right], \qquad -\infty < x < \infty$$

with mean, μ, and variance, σ². The theoretical justification for the normal
distribution lies in the central-limit theorem, which shows that under very
broad conditions the distribution of the average of n independent observations
from any distribution approaches a normal distribution as n becomes large.

Normal Variable: A random variable that is normally distributed.

Part Failure Rate, λ_p: The constant hazard rate of parts whose failure model
is exponential.

Probability Density Function, f(x): A continuous function of a random variable,
X, such that its integral $\int_a^b f(x)\, dx$ represents the probability of X
assuming a value between a and b. The integral over all x is equal to 1.

Random: An event that is independent of time, in the sense that an old unit is
as likely to fail as a new unit. In other words, the hazard rate remains
constant with age.

Random Variable: An output of an experiment which may take any of the values of
a specified set with a specified relative frequency or probability.

Redundancy: The existence of more than one means for accomplishing a given
function. All means of accomplishing the function need not necessarily be
identical.

Redundancy, Active: That redundancy wherein all redundant items are operating
simultaneously.

Redundancy, Standby: That redundancy wherein the alternative means of
performing the function is inoperative until needed and is switched on upon
failure of the primary means of performing the function.

Reliability: The probability of failure-free operation of an item over a time
interval under stated conditions, given that the item is operable at the
beginning of the interval.

Reliability Engineering: A professional discipline which combines knowledge of
statistics and engineering for the purpose of quantitatively evaluating,
predicting, measuring and improving the reliability of human products.

Variance, σ²: The second moment about the mean of a probability distribution.
A measure of the dispersion of a random variable about its mean value. In
testing, variance is a measure of random errors in a series of measurements.

Wearout (of parts): A part failure mode characterized by a hazard rate that
increases with age, i.e., old parts are more likely to fail than new parts.

Weibull Distribution: The statistical distribution modeled by the probability
density function:

$$f(x) = \frac{\beta}{\alpha^{\beta}}\, x^{\beta - 1} \exp\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right], \qquad x > 0,\ \alpha > 0,\ \beta > 0$$

The function was introduced in 1951 by W. Weibull on empirical grounds based on
studies of material strength.
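
As a minimal sketch (not part of the original text), the glossary statements about the Weibull distribution can be checked numerically, assuming the standard two-parameter form with characteristic life α and slope β. The function names and the parameter values (a characteristic life of 1000 hours, slopes of 0.5, 1 and 3) are illustrative assumptions. The sketch evaluates the fraction failed at the characteristic life and compares the hazard rate h(x) = f(x) / (1 - F(x)) early and late in life.

```python
import math

def weibull_pdf(x, alpha, beta):
    """f(x) = (beta / alpha**beta) * x**(beta - 1) * exp(-(x / alpha)**beta), x > 0."""
    return (beta / alpha**beta) * x ** (beta - 1) * math.exp(-((x / alpha) ** beta))

def weibull_cdf(x, alpha, beta):
    """F(x) = 1 - exp(-(x / alpha)**beta)."""
    return 1.0 - math.exp(-((x / alpha) ** beta))

def weibull_hazard(x, alpha, beta):
    """Hazard rate h(x) = f(x) / (1 - F(x))."""
    return weibull_pdf(x, alpha, beta) / (1.0 - weibull_cdf(x, alpha, beta))

ALPHA = 1000.0  # hypothetical characteristic life, in hours

# beta < 1: infant mortality, beta = 1: random, beta > 1: wearout
for beta in (0.5, 1.0, 3.0):
    frac_failed = weibull_cdf(ALPHA, ALPHA, beta)   # fraction failed by x = alpha
    h_early = weibull_hazard(100.0, ALPHA, beta)    # hazard early in life
    h_late = weibull_hazard(900.0, ALPHA, beta)     # hazard late in life
    print(f"beta={beta}: F(alpha)={frac_failed:.3f}, "
          f"h(100)={h_early:.2e}, h(900)={h_late:.2e}")
```

Under these assumptions the fraction failed at x = α is 1 - exp(-1) for every β, while the hazard rate falls with age for β < 1, is constant for β = 1, and rises for β > 1.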

Rule 1:

If Pr(A) and Pr(Ā) represent respectively the probability of the event A
occurring and not occurring, then:

Pr(Ā) = 1 - Pr(A)  (B.1)


Rule  2: 

If  A  and  B  are  two  independent  events,  then  the  probability  that  both  A  and  B 
will  happen,  known  as  their  joint  probability,  is  the  product  of  their  respective 
individual  probabilities  -  that  is: 

Pr(A  and  B)  =  Pr(AB)  =  Pr(A)  Pr(B)  (B.2) 


Rule  3: 


The probability of the joint occurrence of each of N independent events
$A_1, A_2, \ldots, A_N$ is the product of their individual probabilities - that is:

$$\Pr(A_1 \text{ and } A_2 \text{ and } \ldots A_N) = \Pr\left(\prod_{i=1}^{N} A_i\right) = \Pr(A_1)\,\Pr(A_2) \cdots \Pr(A_N) \qquad \text{(B.3)}$$


Rule  4: 

If  A  and  B  are  two  mutually  exclusive  events  -  that  is,  Pr(AB)  =  0  -  then  the 
probability  that  one  of  these  two  events  will  take  place  is  given  by  the  sum  of  their 
individual  probabilities: 

Pr(A or B) = Pr(A + B) = Pr(A) + Pr(B)  (B.4)

Rule 5:

The probability of occurrence of one of N mutually exclusive events
$A_1, A_2, \ldots, A_N$ is:

$$\Pr(A_1 \text{ or } A_2 \text{ or } \ldots A_N) = \Pr\left(\sum_{i=1}^{N} A_i\right) = \sum_{i=1}^{N} \Pr(A_i) \qquad \text{(B.5)}$$


Rule  6: 


If A and B are two events that are not necessarily mutually exclusive - that is,
if Pr(AB) ≠ 0 - then the probability that at least one of these two events will
take place is given by the sum of their individual probabilities less their
joint probability:

Pr(A and/or B) = Pr(A + B) = Pr(A) + Pr(B) - Pr(AB)  (B.6)


Rule  7: 


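The probability that at least one of N events $A_1, A_2, \ldots, A_N$ will take
place is:

$$\Pr(A_1 \text{ and/or } A_2 \text{ and/or } \ldots A_N) = \Pr\left(\sum_{i=1}^{N} A_i\right) = 1 - \Pr(\bar{A}_1 \bar{A}_2 \cdots \bar{A}_N) = 1 - \prod_{i=1}^{N} \left[1 - \Pr(A_i)\right] \qquad \text{(B.7)}$$

The final product form holds when the N events are independent, since it
applies Rule 3 to the complementary events.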
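
The rules above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not a method taken from the text: the function names are hypothetical, the probabilities are made-up values, and the product forms assume independent events, as Rule 3 requires.

```python
from math import prod

def prob_all(probs):
    """Rule 3: joint probability of N independent events."""
    return prod(probs)

def prob_either(p_a, p_b, p_ab):
    """Rule 6: Pr(A and/or B) = Pr(A) + Pr(B) - Pr(AB)."""
    return p_a + p_b - p_ab

def prob_at_least_one(probs):
    """Rule 7 for independent events: 1 - product of [1 - Pr(A_i)]."""
    return 1.0 - prod(1.0 - p for p in probs)

# Hypothetical probabilities of three independent events
p = [0.9, 0.8, 0.95]

print("all three occur:  ", prob_all(p))
print("at least one:     ", prob_at_least_one(p))

# For two independent events, Rule 6 with Pr(AB) = Pr(A) Pr(B) agrees with Rule 7.
print("A and/or B, Rule 6:", prob_either(p[0], p[1], p[0] * p[1]))
print("A and/or B, Rule 7:", prob_at_least_one(p[:2]))
```

One reading of these functions, offered only as an illustration, is that `prob_all` corresponds to a chain in which every element must work, while `prob_at_least_one` corresponds to active redundancy in which any one element suffices.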


This  appendix  presents  important  descriptive  statistics  for  both  the  population 
and  sample.  The  population  is  defined  as  a  collection  of  objects  having  common 
characteristics.  The  population  is  typically  viewed  as  the  universe  or  collection  of 
all  such  objects.  The  parameters  of  the  population  are  fixed,  though  usually 
unknown.  A  sample  is  a  subset  of  the  population.  The  characteristics  of  a  sample 
can  vary  from  sample  to  sample  even  though  extracted  from  the  same  population. 
These  are  two  conceptually  significant  items  when  discussing  descriptive  statistics. 

A.  Measures  of  Central  Tendency 

Mean: 

Population Mean (continuous random variable): $\mu = E(x) = \int_{-\infty}^{\infty} x\, f(x)\, dx$

Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Median:

Population Median: $0.5 = \int_{-\infty}^{P_{50}} f(x)\, dx$, or $F(P_{50}) = 0.5$

Sample Median:

1)  The middle number from order statistics when the sample size is odd

2)  The average of the two middle numbers from order statistics when the
sample size is even

Mode:

Population Mode: the value of x at which f(x) is a maximum, where $\frac{d f(x)}{dx} = 0$

Sample Mode: the value of the random variable with the maximum probability
(frequency) of occurrence
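
A short sketch of the sample measures of central tendency follows. The data are hypothetical times to failure invented for illustration, and the helper names are assumptions; Python's standard `statistics` module is used only for the mode and as a cross-check.

```python
import statistics

# Hypothetical times to failure (hours), for illustration only
times = [120.0, 150.0, 150.0, 180.0, 210.0, 260.0]

def sample_mean(xs):
    """Arithmetic average: sum of the observations divided by n."""
    return sum(xs) / len(xs)

def sample_median(xs):
    """Middle order statistic (n odd) or average of the two middle ones (n even)."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print("mean:  ", sample_mean(times))
print("median:", sample_median(times))
print("mode:  ", statistics.mode(times))  # most frequently occurring value
```

With an even sample size, as here, the median is the average of the third and fourth ordered values.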

B.  Measures of Variability

Variance:

Population Variance (continuous random variable): $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx$

Sample Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$  (unbiased form)

Standard Deviation:

Population Standard Deviation (continuous random variable): $\sigma = \sqrt{\sigma^2}$

Sample Standard Deviation: $s = \sqrt{s^2}$
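
The unbiased sample variance and the sample standard deviation can be computed as in the sketch below. The data and the helper name `sample_variance` are illustrative assumptions; the standard-library `statistics.variance` and `statistics.stdev` functions, which also use the n - 1 divisor, serve as a cross-check.

```python
import math
import statistics

# Hypothetical times to failure (hours), for illustration only
times = [120.0, 150.0, 150.0, 180.0, 210.0, 260.0]

def sample_variance(xs):
    """Unbiased form: sum of squared deviations from the sample mean over n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

s2 = sample_variance(times)
s = math.sqrt(s2)  # sample standard deviation

print("sample variance:          ", s2)
print("sample standard deviation:", s)
print("library cross-check:      ", statistics.variance(times), statistics.stdev(times))
```

Dividing by n instead of n - 1 (as `statistics.pvariance` does) would give a smaller value, which is why the n - 1 form is labeled unbiased when the population mean is estimated from the same sample.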

Gamma Function

$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt$ for $1 \le x \le 2$

[For other values use the formula $\Gamma(x + 1) = x\, \Gamma(x)$]

   x       Γ(x)          x       Γ(x)

  1.00   1.00000        1.50    .88623
  1.01    .99433        1.51    .88659
  1.02    .98884        1.52    .88704
  1.03    .98355        1.53    .88757
  1.04    .97844        1.54    .88818
  1.05    .97350        1.55    .88887
  1.06    .96874        1.56    .88964
  1.07    .96415        1.57    .89049
  1.08    .95973        1.58    .89142
  1.09    .95546        1.59    .89243
  1.10    .95135        1.60    .89352
  1.11    .94740        1.61    .89468
  1.12    .94359        1.62    .89592
  1.13    .93993        1.63    .89724
  1.14    .93642        1.64    .89864
  1.15    .93304        1.65    .90012
  1.16    .92980        1.66    .90167
  1.17    .92670        1.67    .90330
  1.18    .92373        1.68    .90500
  1.19    .92089        1.69    .90678
  1.20    .91817        1.70    .90864
  1.21    .91558        1.71    .91057
  1.22    .91311        1.72    .91258
  1.23    .91075        1.73    .91467
  1.24    .90852        1.74    .91683
  1.25    .90640        1.75    .91906
  1.26    .90440        1.76    .92137
  1.27    .90250        1.77    .92376
  1.28    .90072        1.78    .92623
  1.29    .89904        1.79    .92877
  1.30    .89747        1.80    .93138
  1.31    .89600        1.81    .93408
  1.32    .89464        1.82    .93685
  1.33    .89338        1.83    .93969
  1.34    .89222        1.84    .94261
  1.35    .89115        1.85    .94561
  1.36    .89018        1.86    .94869
  1.37    .88931        1.87    .95184
  1.38    .88854        1.88    .95507
  1.39    .88785        1.89    .95838
  1.40    .88726        1.90    .96177
  1.41    .88676        1.91    .96523
  1.42    .88636        1.92    .96877
  1.43    .88604        1.93    .97240
  1.44    .88581        1.94    .97610
  1.45    .88566        1.95    .97988
  1.46    .88560        1.96    .98374
  1.47    .88563        1.97    .98768
  1.48    .88575        1.98    .99171
  1.49    .88595        1.99    .99581
                        2.00   1.00000
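
The recurrence Γ(x + 1) = x Γ(x) noted with the table can be used to reach arguments outside the tabulated range. The sketch below is illustrative only: the function name `gamma_from_table` is an assumption, and `TABLE` holds just a four-entry excerpt of the tabulated values, so an argument whose reduced value is not in the excerpt raises a `KeyError`. Python's `math.gamma` is used as a cross-check.

```python
import math

# Small excerpt of the tabulated values (x -> Gamma(x)); extend as needed
TABLE = {1.00: 1.00000, 1.25: 0.90640, 1.50: 0.88623, 1.75: 0.91906}

def gamma_from_table(x):
    """Reduce x into [1, 2) with Gamma(x + 1) = x * Gamma(x), then look up the table."""
    if x <= 0:
        raise ValueError("this sketch handles x > 0 only")
    factor = 1.0
    while x >= 2.0:       # step down: Gamma(x) = (x - 1) * Gamma(x - 1)
        x -= 1.0
        factor *= x
    while x < 1.0:        # step up: Gamma(x) = Gamma(x + 1) / x
        factor /= x
        x += 1.0
    return factor * TABLE[round(x, 2)]

for x in (0.5, 2.0, 3.5, 4.25):
    print(f"x={x}: table-based {gamma_from_table(x):.5f}, math.gamma {math.gamma(x):.5f}")
```

For example, Γ(3.5) is obtained as 2.5 × 1.5 × Γ(1.5), so only the tabulated value at x = 1.50 is needed.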
